How I Study AI - Learn AI Papers & Lectures the Easy Way

Manifold Learning

Key Points

  • Manifold learning assumes high-dimensional data actually lies near a much lower-dimensional, smoothly curved surface embedded in a higher-dimensional space.
  • Algorithms like Isomap, LLE, and Laplacian Eigenmaps recover low-dimensional coordinates by preserving geodesic or local neighborhood relationships.
  • A k-nearest neighbor graph approximates the manifold’s local geometry; shortest paths on this graph approximate geodesic distances.
  • Classical MDS on graph geodesic distances yields Isomap embeddings that preserve global geometry for well-sampled manifolds.
  • LLE computes reconstruction weights of each point from its neighbors and then finds low-dimensional coordinates that preserve those weights.
  • All these methods rely on spectral decomposition (eigenvectors of symmetric matrices), which is computationally heavy for large datasets.
  • Choosing k (neighbors) is critical: too small breaks connectivity; too large flattens curvature and violates the manifold assumption.
  • In C++, you can prototype manifold learning with k-NN graphs, Dijkstra’s algorithm, and a Jacobi eigen-solver for small datasets.

Prerequisites

  • Linear algebra (eigenvalues, eigenvectors, symmetric matrices) — Manifold methods reduce to eigendecomposition of symmetric matrices and manipulate Gram and Laplacian matrices.
  • Graph algorithms (k-NN, Dijkstra’s shortest paths) — Neighborhood graphs approximate manifold geometry and geodesic distances via shortest paths.
  • Multivariate calculus and vector norms — Understanding distances, gradients, and smooth manifolds benefits from calculus and norm properties.
  • Numerical computing and stability — Solving ill-conditioned systems and handling regularization is crucial for LLE and spectral steps.
  • Probability and noise models — Helps reason about sampling density, noise effects, and robustness of local approximations.

Detailed Explanation


01 Overview

Manifold learning is a family of techniques for finding low-dimensional structure hidden in high-dimensional data. The key idea is that while each data point may have hundreds or thousands of features, the actual degrees of freedom that generate the data are often much fewer. For example, images of a rotating object live in a huge pixel space, but the rotation can be described by a handful of parameters. Manifold learning methods try to “unfold” the data so that points that are close on the underlying curved surface (the manifold) are also close in the new low-dimensional coordinates.

Several popular algorithms operationalize this idea differently. Isomap builds a neighborhood graph, estimates geodesic (on-manifold) distances using shortest paths, and then applies classical multidimensional scaling (MDS) to preserve those distances globally. Locally Linear Embedding (LLE) preserves local linear relationships: each point is approximated as a weighted combination of its neighbors, and those weights are preserved in the embedding. Laplacian Eigenmaps preserves local adjacency by minimizing differences between neighbors in the embedding, guided by the graph Laplacian.

These methods are especially useful when Euclidean distances in the original space do not reflect true similarity due to curvature. They are unsupervised and rely heavily on linear algebra and spectral graph theory. Although powerful, they can be computationally expensive and sensitive to hyperparameters like the number of neighbors k and the choice of distance metric.

02 Intuition & Analogies

Imagine walking around a hilly park (the manifold). If you measure your location by GPS coordinates (latitude, longitude, and altitude), that’s a 3D description. But your possible movements are constrained to the 2D surface of the park—you can’t tunnel straight through the hill. Your true degrees of freedom are just two: forward/backward and left/right along the ground. Manifold learning tries to recover this 2D map from many observed 3D GPS points.

Another analogy: a rolled-up poster (Swiss roll). The paper is intrinsically 2D, but when rolled it sits in 3D space. Two ants close on the paper might be far apart in 3D if the roll places them on different layers. If we measured straight-line (Euclidean) 3D distance, we’d think they’re far. But the ants can only walk along the paper, so the true distance is the shortest path along the surface (a geodesic). Isomap approximates this by connecting nearby points (ants that can step to each other) and using shortest paths on this graph as the “walkable” distance. After it estimates those distances, it flattens the roll by arranging the points in 2D so the walkable distances are preserved.

LLE uses a different local story. Picture each ant surrounded by a small group of friends. In the original rolled space, each ant can be approximated as a weighted average of its nearby friends because, in a tiny neighborhood, the surface looks almost flat (like a tangent plane). LLE finds these weights first. Then it looks for a low-dimensional arrangement where these same “friendship weights” still reconstruct each ant well. If that works, the local geometry is preserved, and the global shape emerges from stitching all the local neighborhoods together.

03 Formal Definition

Let \(\mathcal{M}\) be a smooth, compact, d-dimensional Riemannian manifold embedded in \(\mathbb{R}^D\) with \(d \ll D\). We observe samples \(X = \{x_i\}_{i=1}^n \subset \mathbb{R}^D\) assumed to lie on or near \(\mathcal{M}\), possibly with noise. The goal is to find an embedding \(f: \mathcal{M} \to \mathbb{R}^d\) (or its discrete approximation on X) such that intrinsic geometric relationships are preserved.

- Isomap: Build a neighborhood graph G over X (e.g., k-NN). Define graph edge weights as Euclidean distances. Approximate geodesic distances \(d_{\mathcal{M}}(x_i, x_j)\) by shortest-path distances \(d_G(i,j)\). Apply classical MDS to the matrix of squared geodesic distances to obtain coordinates \(Y \in \mathbb{R}^{n \times d}\) that preserve these distances in a least-squares sense.
- LLE: For each \(x_i\), find neighbors \(\mathcal{N}(i)\). Compute weights \(w_{ij}\) that reconstruct \(x_i\) from its neighbors by minimizing \(\left\|x_i - \sum_{j \in \mathcal{N}(i)} w_{ij} x_j\right\|_2^2\) subject to \(\sum_j w_{ij} = 1\). Then find \(Y\) minimizing \(\sum_i \left\|y_i - \sum_j w_{ij} y_j\right\|_2^2\) under constraints like \(Y^\top Y = I\) and \(Y^\top \mathbf{1} = 0\). This reduces to an eigenproblem for \(M = (I - W)^\top (I - W)\).
- Laplacian Eigenmaps: Build G with weights \(w_{ij}\) (e.g., heat kernel). Minimize \(\frac{1}{2}\sum_{i,j} w_{ij} \|y_i - y_j\|^2 = \operatorname{tr}(Y^\top L Y)\) subject to \(Y^\top D Y = I\), where \(L = D - W\) is the graph Laplacian and D is the degree matrix.

04 When to Use

Use manifold learning when your data is high dimensional but is likely governed by a few underlying parameters, and when Euclidean distances in the ambient space are misleading. Classic scenarios include: images of a single object under varying pose/lighting (few factors), sensor data from a robot moving in a constrained environment, speech snippets parameterized by a small number of articulatory variables, and scientific simulations exploring a limited parameter space.

Choose Isomap if you believe global, geodesic distances are meaningful and your manifold is well-sampled and mostly convex without many short-circuit edges; it captures global geometry well but can be sensitive to graph connectivity. Choose LLE when you want to preserve local linear structure and the manifold is smooth; it often performs well on data with strong local linearity and does not require computing all-pairs shortest paths. Choose Laplacian Eigenmaps when locality and smoothness are more important than global distances, especially as a preprocessing step for clustering or semi-supervised learning.

Avoid manifold learning if your data is highly noisy, has many disconnected regions, or if you need faithful extrapolation beyond the sampled manifold (these methods are generally non-parametric and do not extrapolate). For massive datasets, consider approximate neighbors, sparse solvers, or parametric variants (e.g., deep autoencoders) due to the spectral step’s cubic cost.

⚠️Common Mistakes

  • Picking k (neighbors) too small: The graph becomes disconnected, making geodesic distances infinite or embeddings fragmented. Always check connectivity and consider increasing k until the largest component includes most points.
  • Picking k too large: The graph adds “shortcuts” that jump across folds; geodesic distances collapse to Euclidean ones, destroying the manifold structure. Use domain knowledge and try a range of k while monitoring reconstruction error and visualization.
  • Ignoring scaling: Features with different scales distort distances and neighborhoods. Standardize features (zero-mean, unit-variance) or use a domain-appropriate metric.
  • Not regularizing LLE: Local covariance matrices can be ill-conditioned when k is near D or neighbors are nearly collinear. Add a small multiple of the trace to the diagonal to stabilize the weight computation.
  • Misinterpreting eigenvectors: In LLE and Laplacian Eigenmaps, the smallest eigenvector is often trivial (constant); skip it and take the next d eigenvectors. In MDS/Isomap, take the largest positive eigenvalues.
  • Using too few samples: Sparse sampling breaks local linear assumptions and graph approximations to geodesics. Ensure adequate density or use adaptive neighborhoods.
  • Expecting extrapolation: Non-parametric embeddings do not generalize to unseen points without an out-of-sample extension (e.g., Nystrom method or training a regressor on the embedding).

Key Formulas

Manifold Assumption

\(\mathcal{M} \subset \mathbb{R}^D, \quad \dim(\mathcal{M}) = d \ll D\)

Explanation: Data lies on or near a smooth, low-dimensional manifold embedded in a higher-dimensional ambient space. The intrinsic dimension d is much smaller than the ambient dimension D.

Noisy Samples

\(x_i = m_i + \varepsilon_i, \quad m_i \in \mathcal{M},\ \varepsilon_i \sim \text{noise}\)

Explanation: Observed points are near the manifold points plus small noise. Many algorithms assume the noise is small so local geometry is preserved.

Geodesic Distance

\(d_{\mathcal{M}}(p,q) = \inf_{\substack{\gamma:[0,1] \to \mathcal{M} \\ \gamma(0)=p,\ \gamma(1)=q}} \int_0^1 \|\gamma'(t)\|\, dt\)

Explanation: The manifold distance between two points is the length of the shortest path constrained to lie on the manifold. It reduces to straight-line distance on flat manifolds.

Graph Shortest Path

\(d_G(i,j) = \min_{\text{paths } i \to j} \sum_{(u,v) \in \text{path}} w_{uv}\)

Explanation: Approximates geodesic distances using a neighborhood graph with edge weights given by local Euclidean distances. Dijkstra’s algorithm computes these efficiently for nonnegative weights.

MDS Double-Centering

\(J = I - \tfrac{1}{n} \mathbf{1}\mathbf{1}^\top, \quad B = -\tfrac{1}{2}\, J D^{(2)} J\)

Explanation: Given squared distances D(2), this recovers the centered inner-product (Gram) matrix B. Eigenvectors of B scaled by the square roots of top eigenvalues give Euclidean coordinates.

LLE Weights

\(\min_{\{w_{ij}\}} \sum_{i=1}^n \left\| x_i - \sum_{j \in \mathcal{N}(i)} w_{ij} x_j \right\|_2^2 \quad \text{s.t. } \sum_j w_{ij} = 1\)

Explanation: Each point is reconstructed from its neighbors with weights that sum to one. The solution uses the local covariance of neighbors with a small regularizer for stability.

LLE Embedding Matrix

\(M = (I - W)^\top (I - W)\)

Explanation: After fixing reconstruction weights W, the embedding Y minimizing reconstruction error is given by the bottom nontrivial eigenvectors of M. The smallest eigenvector is constant and is discarded.

Laplacian Eigenmaps Objective

\(\min_Y \tfrac{1}{2} \sum_{i,j} w_{ij} \|y_i - y_j\|^2 = \min_Y \operatorname{tr}(Y^\top L Y) \quad \text{s.t. } Y^\top D Y = I,\ Y^\top D \mathbf{1} = 0\)

Explanation: Encourages nearby points to stay close in the embedding via the graph Laplacian L = D - W. The solution uses the smallest nontrivial generalized eigenvectors of (L, D).

Isomap Complexity (rough)

\(T_{\text{Isomap}} \approx O(n^2 D) + O(n^2 k \log n) + O(n^3)\)

Explanation: Building k-NN by brute force is O(n^2 D), all-pairs shortest paths via repeated Dijkstra with a binary heap is about O(n^2 k log n), and the spectral step (MDS) for n points is O(n^3).

LLE Complexity (rough)

\(T_{\text{LLE}} \approx O(n^2 D) + O(n k^3) + O(n^3)\)

Explanation: k-NN search is O(n2 D) by brute force, local weight solves are O(k3) per point, and the eigen-decomposition of the dense n×n matrix is O(n3).

Spectral Gap Heuristic

\(\hat{d} = \arg\max_d \,(\lambda_d - \lambda_{d+1})\)

Explanation: Choose the embedding dimension d at a large drop in the eigenvalue spectrum. A prominent gap suggests the intrinsic dimensionality.

Complexity Analysis

Manifold learning pipelines consist of three dominant stages: neighborhood construction, a geometric/spectral core, and an eigendecomposition.

With naive (brute-force) k-NN, computing all pairwise distances costs O(n^2 D) time and O(n^2) space to store them. For moderate D and up to a few thousand points this is practical; beyond that, approximate neighbors (e.g., KD-trees, ball trees, or ANN libraries) reduce the cost but complicate implementation.

For Isomap, after building the k-NN graph with roughly m ≈ nk edges, we approximate geodesic distances by running Dijkstra from each node. Using a binary heap, one source is O(m log n) ≈ O(nk log n); repeating for all sources is O(n^2 k log n). Storing the distance matrix is O(n^2). Next, classical MDS requires building the centered Gram matrix and performing an eigendecomposition of an n×n symmetric matrix. Dense eigensolvers are O(n^3) time and O(n^2) space; Jacobi's method used in small C++ demos is simple but also cubic and slower than optimized libraries.

For LLE, the local weight computation per point solves a k×k linear system, costing O(k^3) per point for Gaussian elimination and O(k^2) space; across all points this is O(n k^3). Constructing M = (I − W)^T(I − W) densely is O(n^3) if done naively; exploiting sparsity reduces this to approximately O(n k^2). The final spectral step again dominates at O(n^3) with O(n^2) memory.

In summary, for exact, naive implementations: Isomap ≈ O(n^2 D + n^2 k log n + n^3) time and O(n^2) space; LLE ≈ O(n^2 D + n k^3 + n^3) time and O(n^2) space. Thus, these methods scale poorly with n; use sparse matrices, iterative eigensolvers (Lanczos), and approximate neighbors for large datasets.

Code Examples

Isomap: k-NN graph, Dijkstra geodesics, and classical MDS (Jacobi eigen) on a Swiss roll
#include <bits/stdc++.h>
using namespace std;

// Simple Jacobi eigenvalue algorithm for symmetric matrices
struct EigenResult {
    vector<double> values;          // eigenvalues
    vector<vector<double>> vectors; // columns are eigenvectors
};

static pair<int,int> maxOffDiag(const vector<vector<double>>& A) {
    int n = (int)A.size();
    int p = 0, q = 1;
    double m = 0.0;
    for (int i = 0; i < n; ++i) {
        for (int j = i+1; j < n; ++j) {
            double v = fabs(A[i][j]);
            if (v > m) { m = v; p = i; q = j; }
        }
    }
    return {p, q};
}

static EigenResult jacobiEigen(vector<vector<double>> A, int maxIter = 10000, double eps = 1e-10) {
    int n = (int)A.size();
    vector<vector<double>> V(n, vector<double>(n, 0.0));
    for (int i = 0; i < n; ++i) V[i][i] = 1.0; // identity

    for (int iter = 0; iter < maxIter; ++iter) {
        auto [p, q] = maxOffDiag(A);
        if (fabs(A[p][q]) < eps) break; // converged
        double app = A[p][p];
        double aqq = A[q][q];
        double apq = A[p][q];
        double phi = 0.5 * atan2(2.0 * apq, aqq - app);
        double c = cos(phi), s = sin(phi);

        // Rotate A
        for (int i = 0; i < n; ++i) {
            if (i == p || i == q) continue;
            double aip = A[i][p];
            double aiq = A[i][q];
            A[i][p] = c * aip - s * aiq;
            A[p][i] = A[i][p];
            A[i][q] = s * aip + c * aiq;
            A[q][i] = A[i][q];
        }
        double new_app = c*c*app - 2.0*s*c*apq + s*s*aqq;
        double new_aqq = s*s*app + 2.0*s*c*apq + c*c*aqq;
        A[p][p] = new_app;
        A[q][q] = new_aqq;
        A[p][q] = 0.0;
        A[q][p] = 0.0;

        // Rotate V
        for (int i = 0; i < n; ++i) {
            double vip = V[i][p];
            double viq = V[i][q];
            V[i][p] = c * vip - s * viq;
            V[i][q] = s * vip + c * viq;
        }
    }
    vector<double> evals(n);
    for (int i = 0; i < n; ++i) evals[i] = A[i][i];

    // Sort eigenvalues descending with corresponding eigenvectors
    vector<int> idx(n);
    iota(idx.begin(), idx.end(), 0);
    sort(idx.begin(), idx.end(), [&](int a, int b){ return evals[a] > evals[b]; });
    vector<double> sortedVals(n);
    vector<vector<double>> sortedVecs(n, vector<double>(n));
    for (int k = 0; k < n; ++k) {
        sortedVals[k] = evals[idx[k]];
        for (int i = 0; i < n; ++i) sortedVecs[i][k] = V[i][idx[k]];
    }
    return {sortedVals, sortedVecs};
}

static double sqr(double x){ return x*x; }

// Generate Swiss roll (n points)
vector<vector<double>> swissRoll(int n, unsigned seed = 42) {
    mt19937 rng(seed);
    uniform_real_distribution<double> U(1.5*M_PI, 4.5*M_PI);
    uniform_real_distribution<double> H(0.0, 10.0);
    vector<vector<double>> X(n, vector<double>(3));
    for (int i = 0; i < n; ++i) {
        double t = U(rng);
        double h = H(rng);
        X[i][0] = t * cos(t);
        X[i][1] = h;
        X[i][2] = t * sin(t);
    }
    return X;
}

// Euclidean distance between rows i and j
static double dist(const vector<vector<double>>& X, int i, int j) {
    double s = 0.0;
    int D = (int)X[0].size();
    for (int d = 0; d < D; ++d) s += sqr(X[i][d] - X[j][d]);
    return sqrt(s);
}

// Build k-NN graph (symmetric) with Euclidean weights
vector<vector<pair<int,double>>> buildKNNGraph(const vector<vector<double>>& X, int k) {
    int n = (int)X.size();
    vector<vector<pair<int,double>>> adj(n);
    for (int i = 0; i < n; ++i) {
        vector<pair<double,int>> cand;
        cand.reserve(n-1);
        for (int j = 0; j < n; ++j) if (i != j) cand.push_back({dist(X,i,j), j});
        nth_element(cand.begin(), cand.begin()+k, cand.end());
        sort(cand.begin(), cand.begin()+k); // ensure deterministic neighbors
        for (int t = 0; t < k; ++t) {
            int j = cand[t].second;
            double w = cand[t].first;
            adj[i].push_back({j, w});
            adj[j].push_back({i, w}); // make symmetric
        }
    }
    // Optional: deduplicate symmetric edges
    for (int i = 0; i < n; ++i) {
        sort(adj[i].begin(), adj[i].end());
        adj[i].erase(unique(adj[i].begin(), adj[i].end(),
                            [](auto& a, auto& b){ return a.first == b.first; }),
                     adj[i].end());
    }
    return adj;
}

// Dijkstra from a single source
vector<double> dijkstra(const vector<vector<pair<int,double>>>& adj, int s) {
    int n = (int)adj.size();
    const double INF = 1e100;
    vector<double> dist(n, INF);
    dist[s] = 0.0;
    using P = pair<double,int>;
    priority_queue<P, vector<P>, greater<P>> pq;
    pq.push({0.0, s});
    while (!pq.empty()) {
        auto [d, u] = pq.top(); pq.pop();
        if (d > dist[u]) continue;
        for (auto [v, w] : adj[u]) {
            if (dist[v] > d + w) {
                dist[v] = d + w;
                pq.push({dist[v], v});
            }
        }
    }
    return dist;
}

// Classical MDS from distance matrix to d dimensions
vector<vector<double>> classicalMDS(const vector<vector<double>>& D, int d) {
    int n = (int)D.size();
    // Build squared distance matrix
    vector<vector<double>> S(n, vector<double>(n));
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j) S[i][j] = sqr(D[i][j]);

    // Double centering: B = -0.5 * J S J
    vector<double> rowMean(n, 0.0), colMean(n, 0.0);
    double totalMean = 0.0;
    for (int i = 0; i < n; ++i) for (int j = 0; j < n; ++j) rowMean[i] += S[i][j];
    for (int i = 0; i < n; ++i) rowMean[i] /= n;
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < n; ++i) colMean[j] += S[i][j];
        colMean[j] /= n;
    }
    for (int i = 0; i < n; ++i) for (int j = 0; j < n; ++j) totalMean += S[i][j];
    totalMean /= (n * n);

    vector<vector<double>> B(n, vector<double>(n));
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            B[i][j] = -0.5 * (S[i][j] - rowMean[i] - colMean[j] + totalMean);

    // Eigen-decomposition of symmetric B
    EigenResult er = jacobiEigen(B);

    // Take top d positive eigenvalues/eigenvectors
    vector<vector<double>> Y(n, vector<double>(d, 0.0));
    int taken = 0;
    for (int k = 0; k < (int)er.values.size() && taken < d; ++k) {
        double lambda = er.values[k];
        if (lambda <= 0) continue; // skip nonpositive
        double sroot = sqrt(lambda);
        for (int i = 0; i < n; ++i) Y[i][taken] = er.vectors[i][k] * sroot;
        ++taken;
    }
    // If not enough positive eigenvalues, remaining columns stay zero
    return Y;
}

int main(){
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n = 120;    // number of points
    int k = 8;      // neighbors
    int outDim = 2; // target dimension

    // 1) Data
    auto X = swissRoll(n);

    // 2) Build k-NN graph
    auto adj = buildKNNGraph(X, k);

    // 3) All-pairs shortest paths (geodesic distances); assumes the graph is connected
    vector<vector<double>> G(n, vector<double>(n, 0.0));
    for (int i = 0; i < n; ++i) {
        G[i] = dijkstra(adj, i);
    }

    // 4) Classical MDS to obtain embedding
    auto Y = classicalMDS(G, outDim);

    // 5) Output 2D coordinates
    cout << fixed << setprecision(6);
    for (int i = 0; i < n; ++i) {
        cout << Y[i][0] << ", " << Y[i][1] << "\n";
    }
    return 0;
}

This program demonstrates Isomap. It generates a 3D Swiss roll, builds a symmetric k-nearest neighbor graph with Euclidean edge weights, and computes approximate geodesic distances between all pairs by running Dijkstra from each node. It then applies classical MDS by double-centering the squared geodesic distances to produce a Gram matrix, performs a Jacobi eigendecomposition, and takes the top d eigenvectors scaled by the square roots of their eigenvalues as the 2D embedding. The Jacobi eigen-solver is simple and adequate for small n, making the example self-contained without external libraries.

Time: O(n^2 D) for brute-force k-NN + O(n^2 k log n) for all-pairs Dijkstra + O(n^3) for the eigendecomposition (dominant for larger n).
Space: O(n^2) to store the distance matrix and Gram matrix; O(nk) for the graph.
Locally Linear Embedding (LLE): weights, (I−W)^T(I−W), and bottom eigenvectors (Jacobi)
#include <bits/stdc++.h>
using namespace std;

struct EigenResult {
    vector<double> values;          // eigenvalues
    vector<vector<double>> vectors; // columns are eigenvectors
};

static pair<int,int> maxOffDiag(const vector<vector<double>>& A) {
    int n = (int)A.size();
    int p = 0, q = 1;
    double m = 0.0;
    for (int i = 0; i < n; ++i) for (int j = i+1; j < n; ++j) {
        double v = fabs(A[i][j]);
        if (v > m) { m = v; p = i; q = j; }
    }
    return {p, q};
}

static EigenResult jacobiEigen(vector<vector<double>> A, int maxIter = 10000, double eps = 1e-10) {
    int n = (int)A.size();
    vector<vector<double>> V(n, vector<double>(n, 0.0));
    for (int i = 0; i < n; ++i) V[i][i] = 1.0;
    for (int iter = 0; iter < maxIter; ++iter) {
        auto [p, q] = maxOffDiag(A);
        if (fabs(A[p][q]) < eps) break;
        double app = A[p][p], aqq = A[q][q], apq = A[p][q];
        double phi = 0.5 * atan2(2.0*apq, aqq - app);
        double c = cos(phi), s = sin(phi);
        for (int i = 0; i < n; ++i) if (i != p && i != q) {
            double aip = A[i][p], aiq = A[i][q];
            A[i][p] = c*aip - s*aiq; A[p][i] = A[i][p];
            A[i][q] = s*aip + c*aiq; A[q][i] = A[i][q];
        }
        double new_app = c*c*app - 2*s*c*apq + s*s*aqq;
        double new_aqq = s*s*app + 2*s*c*apq + c*c*aqq;
        A[p][p] = new_app; A[q][q] = new_aqq;
        A[p][q] = 0.0; A[q][p] = 0.0;
        for (int i = 0; i < n; ++i) {
            double vip = V[i][p], viq = V[i][q];
            V[i][p] = c*vip - s*viq;
            V[i][q] = s*vip + c*viq;
        }
    }
    vector<double> evals(n);
    for (int i = 0; i < n; ++i) evals[i] = A[i][i];
    // Sort ascending (for LLE we need the smallest eigenvalues)
    vector<int> idx(n);
    iota(idx.begin(), idx.end(), 0);
    sort(idx.begin(), idx.end(), [&](int a, int b){ return evals[a] < evals[b]; });
    vector<double> sortedVals(n);
    vector<vector<double>> sortedVecs(n, vector<double>(n));
    for (int k = 0; k < n; ++k) {
        sortedVals[k] = evals[idx[k]];
        for (int i = 0; i < n; ++i) sortedVecs[i][k] = V[i][idx[k]];
    }
    return {sortedVals, sortedVecs};
}

static double sqr(double x){ return x*x; }

// Generate Swiss roll
vector<vector<double>> swissRoll(int n, unsigned seed = 1337) {
    mt19937 rng(seed);
    uniform_real_distribution<double> U(1.5*M_PI, 4.5*M_PI);
    uniform_real_distribution<double> H(0.0, 10.0);
    vector<vector<double>> X(n, vector<double>(3));
    for (int i = 0; i < n; ++i) {
        double t = U(rng);
        double h = H(rng);
        X[i][0] = t * cos(t);
        X[i][1] = h;
        X[i][2] = t * sin(t);
    }
    return X;
}

static double euclid(const vector<vector<double>>& X, int i, int j) {
    double s = 0.0;
    int D = (int)X[0].size();
    for (int d = 0; d < D; ++d) s += sqr(X[i][d] - X[j][d]);
    return sqrt(s);
}

vector<vector<int>> knnIndices(const vector<vector<double>>& X, int k) {
    int n = (int)X.size();
    vector<vector<int>> N(n);
    for (int i = 0; i < n; ++i) {
        vector<pair<double,int>> cand;
        cand.reserve(n-1);
        for (int j = 0; j < n; ++j) if (i != j) cand.push_back({euclid(X,i,j), j});
        nth_element(cand.begin(), cand.begin()+k, cand.end());
        sort(cand.begin(), cand.begin()+k);
        N[i].resize(k);
        for (int t = 0; t < k; ++t) N[i][t] = cand[t].second;
    }
    return N;
}

// Solve A x = b using Gaussian elimination with partial pivoting (small k)
vector<double> solveLinear(vector<vector<double>> A, vector<double> b) {
    int n = (int)A.size();
    for (int col = 0; col < n; ++col) {
        // pivot
        int piv = col;
        for (int r = col+1; r < n; ++r) if (fabs(A[r][col]) > fabs(A[piv][col])) piv = r;
        // near-singular pivots are tolerated; the caller regularizes C beforehand
        if (piv != col) { swap(A[piv], A[col]); swap(b[piv], b[col]); }
        double diag = A[col][col];
        for (int j = col; j < n; ++j) A[col][j] /= diag;
        b[col] /= diag;
        for (int r = 0; r < n; ++r) if (r != col) {
            double factor = A[r][col];
            for (int j = col; j < n; ++j) A[r][j] -= factor * A[col][j];
            b[r] -= factor * b[col];
        }
    }
    return b; // now the solution
}

int main(){
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n = 120;       // number of points
    int D = 3;         // ambient dimension
    int k = 10;        // neighbors
    int outDim = 2;    // target dimension
    double reg = 1e-3; // LLE regularization strength

    // 1) Data
    auto X = swissRoll(n);

    // 2) k-NN indices
    auto Nbrs = knnIndices(X, k);

    // 3) Compute reconstruction weights W (n x n, sparse but represented densely here)
    vector<vector<double>> W(n, vector<double>(n, 0.0));

    for (int i = 0; i < n; ++i) {
        // Build Z: D x k neighbor difference matrix
        vector<vector<double>> Z(D, vector<double>(k, 0.0));
        for (int t = 0; t < k; ++t) {
            int j = Nbrs[i][t];
            for (int d = 0; d < D; ++d)
                Z[d][t] = X[j][d] - X[i][d];
        }
        // Local covariance C = Z^T Z (k x k)
        vector<vector<double>> C(k, vector<double>(k, 0.0));
        for (int a = 0; a < k; ++a)
            for (int b = 0; b < k; ++b)
                for (int d = 0; d < D; ++d)
                    C[a][b] += Z[d][a] * Z[d][b];
        // Regularize: C += reg * trace(C) * I
        double tr = 0.0;
        for (int a = 0; a < k; ++a) tr += C[a][a];
        double alpha = reg * (tr > 0 ? tr : 1.0);
        for (int a = 0; a < k; ++a) C[a][a] += alpha;

        // Solve C w = 1, then normalize sum(w) = 1
        vector<double> ones(k, 1.0);
        vector<double> w = solveLinear(C, ones);
        double sumw = 0.0;
        for (double v : w) sumw += v;
        if (fabs(sumw) < 1e-12) sumw = 1.0; // safety
        for (double &v : w) v /= sumw;

        // Store in W
        for (int t = 0; t < k; ++t) W[i][Nbrs[i][t]] = w[t];
    }

    // 4) Build M = (I - W)^T (I - W)
    vector<vector<double>> IminusW(n, vector<double>(n, 0.0));
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) IminusW[i][j] = -W[i][j];
        IminusW[i][i] += 1.0;
    }
    vector<vector<double>> M(n, vector<double>(n, 0.0));
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            double s = 0.0;
            for (int t = 0; t < n; ++t) s += IminusW[t][i] * IminusW[t][j];
            M[i][j] = s;
        }
    }

    // 5) Smallest eigenvectors of symmetric M
    EigenResult er = jacobiEigen(M);

    // Skip the first eigenvector (approximately constant); take the next outDim.
    vector<vector<double>> Y(n, vector<double>(outDim, 0.0));
    for (int comp = 0; comp < outDim; ++comp) {
        int col = comp + 1; // skip smallest
        for (int i = 0; i < n; ++i) Y[i][comp] = er.vectors[i][col];
    }

    // 6) Output 2D coordinates
    cout << fixed << setprecision(6);
    for (int i = 0; i < n; ++i) {
        cout << Y[i][0] << ", " << Y[i][1] << "\n";
    }
    return 0;
}

This program implements LLE. For each point, it finds k nearest neighbors, computes the local covariance of neighbor differences, regularizes it, and solves for reconstruction weights that sum to one. It then forms M = (I−W)^T(I−W) and computes its smallest nontrivial eigenvectors with a Jacobi solver. The embedding is given by eigenvectors 2..(d+1). The code is dense and cubic for clarity; in practice, exploit W’s sparsity and use iterative eigensolvers for scalability.

Time: O(n^2 D) to compute k-NN by brute force + O(n k^3) to solve local systems + O(n^3) to form and decompose M densely.
Space: O(n^2) for W and M in dense form; O(nk) if stored sparsely.
Tags: manifold learning, isomap, locally linear embedding, laplacian eigenmaps, k-NN graph, geodesic distance, classical MDS, spectral embedding, intrinsic dimension, swiss roll, jacobi eigensolver, shortest path, graph laplacian, double-centering, unsupervised learning