How I Study AI - Learn AI Papers & Lectures the Easy Way

Manifold Learning

Key Points

  • Manifold learning assumes high-dimensional data actually lies near a much lower-dimensional, smoothly curved surface embedded in a higher-dimensional space.
  • Algorithms like Isomap, LLE, and Laplacian Eigenmaps recover low-dimensional coordinates by preserving geodesic or local neighborhood relationships.
  • A k-nearest neighbor graph approximates the manifold’s local geometry; shortest paths on this graph approximate geodesic distances.
  • Classical MDS on graph geodesic distances yields Isomap embeddings that preserve global geometry for well-sampled manifolds.
  • LLE computes reconstruction weights of each point from its neighbors and then finds low-dimensional coordinates that preserve those weights.
  • All these methods rely on spectral decomposition (eigenvectors of symmetric matrices), which is computationally heavy for large datasets.
  • Choosing k (neighbors) is critical: too small breaks connectivity; too large flattens curvature and violates the manifold assumption.
  • In C++, you can prototype manifold learning with k-NN graphs, Dijkstra’s algorithm, and a Jacobi eigen-solver for small datasets.

Prerequisites

  • Linear algebra (eigenvalues, eigenvectors, symmetric matrices) — Manifold methods reduce to eigendecomposition of symmetric matrices and manipulate Gram and Laplacian matrices.
  • Graph algorithms (k-NN, Dijkstra’s shortest paths) — Neighborhood graphs approximate manifold geometry and geodesic distances via shortest paths.
  • Multivariate calculus and vector norms — Understanding distances, gradients, and smooth manifolds benefits from calculus and norm properties.
  • Numerical computing and stability — Solving ill-conditioned systems and handling regularization is crucial for LLE and spectral steps.
  • Probability and noise models — Helps reason about sampling density, noise effects, and robustness of local approximations.

Detailed Explanation


01 Overview

Manifold learning is a family of techniques for finding low-dimensional structure hidden in high-dimensional data. The key idea is that while each data point may have hundreds or thousands of features, the actual degrees of freedom that generate the data are often much fewer. For example, images of a rotating object live in a huge pixel space, but the rotation can be described by a handful of parameters. Manifold learning methods try to “unfold” the data so that points that are close on the underlying curved surface (the manifold) are also close in the new low-dimensional coordinates.

Several popular algorithms operationalize this idea differently. Isomap builds a neighborhood graph, estimates geodesic (on-manifold) distances using shortest paths, and then applies classical multidimensional scaling (MDS) to preserve those distances globally. Locally Linear Embedding (LLE) preserves local linear relationships: each point is approximated as a weighted combination of its neighbors, and those weights are preserved in the embedding. Laplacian Eigenmaps preserves local adjacency by minimizing differences between neighbors in the embedding, guided by the graph Laplacian.

These methods are especially useful when Euclidean distances in the original space do not reflect true similarity due to curvature. They are unsupervised and rely heavily on linear algebra and spectral graph theory. Although powerful, they can be computationally expensive and sensitive to hyperparameters like the number of neighbors k and the choice of distance metric.

02 Intuition & Analogies

Imagine walking around a hilly park (the manifold). If you measure your location by GPS coordinates (latitude, longitude, and altitude), that’s a 3D description. But your possible movements are constrained to the 2D surface of the park—you can’t tunnel straight through the hill. Your true degrees of freedom are just two: forward/backward and left/right along the ground. Manifold learning tries to recover this 2D map from many observed 3D GPS points.

Another analogy: a rolled-up poster (Swiss roll). The paper is intrinsically 2D, but when rolled it sits in 3D space. Two ants close on the paper might be far apart in 3D if the roll places them on different layers. If we measured straight-line (Euclidean) 3D distance, we’d think they’re far. But the ants can only walk along the paper, so the true distance is the shortest path along the surface (a geodesic). Isomap approximates this by connecting nearby points (ants that can step to each other) and using shortest paths on this graph as the “walkable” distance. After it estimates those distances, it flattens the roll by arranging the points in 2D so the walkable distances are preserved.

LLE uses a different local story. Picture each ant surrounded by a small group of friends. In the original rolled space, each ant can be approximated as a weighted average of its nearby friends because, in a tiny neighborhood, the surface looks almost flat (like a tangent plane). LLE finds these weights first. Then it looks for a low-dimensional arrangement where these same “friendship weights” still reconstruct each ant well. If that works, the local geometry is preserved, and the global shape emerges from stitching all the local neighborhoods together.

03 Formal Definition

Let \(\mathcal{M}\) be a smooth, compact, d-dimensional Riemannian manifold embedded in \(\mathbb{R}^D\) with \(d \ll D\). We observe samples \(X = \{x_i\}_{i=1}^n \subset \mathbb{R}^D\) assumed to lie on or near \(\mathcal{M}\), possibly with noise. The goal is to find an embedding \(f: \mathcal{M} \to \mathbb{R}^d\) (or its discrete approximation on X) such that intrinsic geometric relationships are preserved.

- Isomap: Build a neighborhood graph G over X (e.g., k-NN). Define graph edge weights as Euclidean distances. Approximate geodesic distances \(d_{\mathcal{M}}(x_i, x_j)\) by shortest-path distances \(d_G(i,j)\). Apply classical MDS to the matrix of squared geodesic distances to obtain coordinates \(Y \in \mathbb{R}^{n \times d}\) that preserve these distances in a least-squares sense.
- LLE: For each \(x_i\), find neighbors \(\mathcal{N}(i)\). Compute weights \(w_{ij}\) that reconstruct \(x_i\) from its neighbors by minimizing \(\left\|x_i - \sum_{j \in \mathcal{N}(i)} w_{ij} x_j\right\|_2^2\) subject to \(\sum_j w_{ij} = 1\). Then find \(Y\) minimizing \(\sum_i \left\|y_i - \sum_j w_{ij} y_j\right\|_2^2\) under constraints like \(Y^\top Y = I\) and \(Y^\top \mathbf{1} = 0\). This reduces to an eigenproblem for \(M = (I - W)^\top (I - W)\).
- Laplacian Eigenmaps: Build G with weights \(w_{ij}\) (e.g., heat kernel). Minimize \(\frac{1}{2}\sum_{i,j} w_{ij} \|y_i - y_j\|^2 = \operatorname{tr}(Y^\top L Y)\) subject to \(Y^\top D Y = I\), where \(L = D - W\) is the graph Laplacian and D is the degree matrix.

04 When to Use

Use manifold learning when your data is high dimensional but is likely governed by a few underlying parameters, and when Euclidean distances in the ambient space are misleading. Classic scenarios include: images of a single object under varying pose/lighting (few factors), sensor data from a robot moving in a constrained environment, speech snippets parameterized by a small number of articulatory variables, and scientific simulations exploring a limited parameter space.

Choose Isomap if you believe global, geodesic distances are meaningful and your manifold is well-sampled and mostly convex without many short-circuit edges; it captures global geometry well but can be sensitive to graph connectivity. Choose LLE when you want to preserve local linear structure and the manifold is smooth; it often performs well on data with strong local linearity and does not require computing all-pairs shortest paths. Choose Laplacian Eigenmaps when locality and smoothness are more important than global distances, especially as a preprocessing step for clustering or semi-supervised learning.

Avoid manifold learning if your data is highly noisy, has many disconnected regions, or if you need faithful extrapolation beyond the sampled manifold (these methods are generally non-parametric and do not extrapolate). For massive datasets, consider approximate neighbors, sparse solvers, or parametric variants (e.g., deep autoencoders) due to the spectral step’s cubic cost.

⚠️Common Mistakes

  • Picking k (neighbors) too small: The graph becomes disconnected, making geodesic distances infinite or embeddings fragmented. Always check connectivity and consider increasing k until the largest component includes most points.
  • Picking k too large: The graph adds “shortcuts” that jump across folds; geodesic distances collapse to Euclidean ones, destroying the manifold structure. Use domain knowledge and try a range of k while monitoring reconstruction error and visualization.
  • Ignoring scaling: Features with different scales distort distances and neighborhoods. Standardize features (zero-mean, unit-variance) or use a domain-appropriate metric.
  • Not regularizing LLE: Local covariance matrices can be ill-conditioned when k is near D or neighbors are nearly collinear. Add a small multiple of the trace to the diagonal to stabilize the weight computation.
  • Misinterpreting eigenvectors: In LLE and Laplacian Eigenmaps, the smallest eigenvector is often trivial (constant); skip it and take the next d eigenvectors. In MDS/Isomap, take the largest positive eigenvalues.
  • Using too few samples: Sparse sampling breaks local linear assumptions and graph approximations to geodesics. Ensure adequate density or use adaptive neighborhoods.
  • Expecting extrapolation: Non-parametric embeddings do not generalize to unseen points without an out-of-sample extension (e.g., Nystrom method or training a regressor on the embedding).

Key Formulas

Manifold Assumption

\(\mathcal{M} \subset \mathbb{R}^D, \quad \dim(\mathcal{M}) = d \ll D\)

Explanation: Data lies on or near a smooth, low-dimensional manifold embedded in a higher-dimensional ambient space. The intrinsic dimension d is much smaller than the ambient dimension D.

Noisy Samples

\(x_i = m_i + \varepsilon_i, \quad m_i \in \mathcal{M},\ \varepsilon_i \sim \text{noise}\)

Explanation: Observed points are near the manifold points plus small noise. Many algorithms assume the noise is small so local geometry is preserved.

Geodesic Distance

\(d_{\mathcal{M}}(p,q) = \inf_{\substack{\gamma:[0,1] \to \mathcal{M} \\ \gamma(0)=p,\ \gamma(1)=q}} \int_0^1 \|\gamma'(t)\|\, dt\)

Explanation: The manifold distance between two points is the length of the shortest path constrained to lie on the manifold. It reduces to straight-line distance on flat manifolds.

Graph Shortest Path

\(d_G(i,j) = \min_{\text{paths } i \to j} \sum_{(u,v) \in \text{path}} w_{uv}\)

Explanation: Approximates geodesic distances using a neighborhood graph with edge weights given by local Euclidean distances. Dijkstra’s algorithm computes these efficiently for nonnegative weights.

MDS Double-Centering

\(J = I - \tfrac{1}{n} \mathbf{1}\mathbf{1}^\top, \quad B = -\tfrac{1}{2}\, J D^{(2)} J\)

Explanation: Given squared distances D(2), this recovers the centered inner-product (Gram) matrix B. Eigenvectors of B scaled by the square roots of top eigenvalues give Euclidean coordinates.

LLE Weights

\(\min_{\{w_{ij}\}} \sum_{i=1}^n \left\| x_i - \sum_{j \in \mathcal{N}(i)} w_{ij} x_j \right\|_2^2 \quad \text{s.t. } \sum_j w_{ij} = 1\)

Explanation: Each point is reconstructed from its neighbors with weights that sum to one. The solution uses the local covariance of neighbors with a small regularizer for stability.

LLE Embedding Matrix

\(M = (I - W)^\top (I - W)\)

Explanation: After fixing reconstruction weights W, the embedding Y minimizing reconstruction error is given by the bottom nontrivial eigenvectors of M. The smallest eigenvector is constant and is discarded.

Laplacian Eigenmaps Objective

\(\min_Y \tfrac{1}{2} \sum_{i,j} w_{ij} \|y_i - y_j\|^2 = \min_Y \operatorname{tr}(Y^\top L Y) \quad \text{s.t. } Y^\top D Y = I,\ Y^\top D \mathbf{1} = 0\)

Explanation: Encourages nearby points to stay close in the embedding via the graph Laplacian L = D - W. The solution uses the smallest nontrivial generalized eigenvectors of (L, D).

Isomap Complexity (rough)

\(T_{\text{Isomap}} \approx O(n^2 D) + O(n^2 k \log n) + O(n^3)\)

Explanation: Building k-NN by brute force is O(n^2 D), all-pairs shortest paths via repeated Dijkstra with a binary heap is about O(n^2 k log n), and the spectral step (MDS) for n points is O(n^3).

LLE Complexity (rough)

\(T_{\text{LLE}} \approx O(n^2 D) + O(n k^3) + O(n^3)\)

Explanation: k-NN search is O(n2 D) by brute force, local weight solves are O(k3) per point, and the eigen-decomposition of the dense n×n matrix is O(n3).

Spectral Gap Heuristic

\(\hat{d} = \arg\max_d \,(\lambda_d - \lambda_{d+1})\)

Explanation: Choose the embedding dimension d at a large drop in the eigenvalue spectrum. A prominent gap suggests the intrinsic dimensionality.

Complexity Analysis

Manifold learning pipelines consist of three dominant stages: neighborhood construction, a geometric/spectral core, and an eigendecomposition.

With naive (brute-force) k-NN, computing all pairwise distances costs O(n^2 D) time and O(n^2) space to store them. For moderate D and up to a few thousand points this is practical; beyond that, approximate neighbors (e.g., KD-trees, ball trees, or ANN libraries) reduce the cost but complicate implementation.

For Isomap, after building the k-NN graph with roughly m ≈ nk edges, we approximate geodesic distances by running Dijkstra from each node. Using a binary heap, one source is O(m log n) ≈ O(nk log n); repeating for all sources is O(n^2 k log n). Storing the distance matrix is O(n^2). Next, classical MDS requires building the centered Gram matrix and performing an eigendecomposition of an n×n symmetric matrix. Dense eigensolvers are O(n^3) time and O(n^2) space; Jacobi's method used in small C++ demos is simple but also cubic and slower than optimized libraries.

For LLE, the local weight computation per point solves a k×k linear system, costing O(k^3) per point for Gaussian elimination and O(k^2) space; across all points this is O(n k^3). Constructing M = (I − W)^T(I − W) densely is O(n^3) if done naively; exploiting sparsity reduces this to approximately O(n k^2). The final spectral step again dominates at O(n^3) with O(n^2) memory.

In summary, for exact, naive implementations: Isomap ≈ O(n^2 D + n^2 k log n + n^3) time and O(n^2) space; LLE ≈ O(n^2 D + n k^3 + n^3) time and O(n^2) space. Thus, these methods scale poorly with n; use sparse matrices, iterative eigensolvers (Lanczos), and approximate neighbors for large datasets.

Code Examples

Isomap: k-NN graph, Dijkstra geodesics, and classical MDS (Jacobi eigen) on a Swiss roll
#include <bits/stdc++.h>
using namespace std;

// Simple Jacobi eigenvalue algorithm for symmetric matrices
struct EigenResult {
    vector<double> values;          // eigenvalues
    vector<vector<double>> vectors; // columns are eigenvectors
};

static pair<int,int> maxOffDiag(const vector<vector<double>>& A) {
    int n = (int)A.size();
    int p = 0, q = 1;
    double m = 0.0;
    for (int i = 0; i < n; ++i) {
        for (int j = i+1; j < n; ++j) {
            double v = fabs(A[i][j]);
            if (v > m) { m = v; p = i; q = j; }
        }
    }
    return {p, q};
}

static EigenResult jacobiEigen(vector<vector<double>> A, int maxIter = 10000, double eps = 1e-10) {
    int n = (int)A.size();
    vector<vector<double>> V(n, vector<double>(n, 0.0));
    for (int i = 0; i < n; ++i) V[i][i] = 1.0; // identity

    for (int iter = 0; iter < maxIter; ++iter) {
        auto [p, q] = maxOffDiag(A);
        if (fabs(A[p][q]) < eps) break; // converged
        double app = A[p][p];
        double aqq = A[q][q];
        double apq = A[p][q];
        double phi = 0.5 * atan2(2.0 * apq, aqq - app);
        double c = cos(phi), s = sin(phi);

        // Rotate A
        for (int i = 0; i < n; ++i) {
            if (i == p || i == q) continue;
            double aip = A[i][p];
            double aiq = A[i][q];
            A[i][p] = c * aip - s * aiq;
            A[p][i] = A[i][p];
            A[i][q] = s * aip + c * aiq;
            A[q][i] = A[i][q];
        }
        double new_app = c*c*app - 2.0*s*c*apq + s*s*aqq;
        double new_aqq = s*s*app + 2.0*s*c*apq + c*c*aqq;
        A[p][p] = new_app;
        A[q][q] = new_aqq;
        A[p][q] = 0.0;
        A[q][p] = 0.0;

        // Rotate V
        for (int i = 0; i < n; ++i) {
            double vip = V[i][p];
            double viq = V[i][q];
            V[i][p] = c * vip - s * viq;
            V[i][q] = s * vip + c * viq;
        }
    }
    vector<double> evals(n);
    for (int i = 0; i < n; ++i) evals[i] = A[i][i];

    // Sort eigenvalues descending with corresponding eigenvectors
    vector<int> idx(n);
    iota(idx.begin(), idx.end(), 0);
    sort(idx.begin(), idx.end(), [&](int a, int b){ return evals[a] > evals[b]; });
    vector<double> sortedVals(n);
    vector<vector<double>> sortedVecs(n, vector<double>(n));
    for (int k = 0; k < n; ++k) {
        sortedVals[k] = evals[idx[k]];
        for (int i = 0; i < n; ++i) sortedVecs[i][k] = V[i][idx[k]];
    }
    return {sortedVals, sortedVecs};
}

static double sqr(double x){ return x*x; }

// Generate Swiss roll (n points)
vector<vector<double>> swissRoll(int n, unsigned seed = 42) {
    mt19937 rng(seed);
    uniform_real_distribution<double> U(1.5*M_PI, 4.5*M_PI);
    uniform_real_distribution<double> H(0.0, 10.0);
    vector<vector<double>> X(n, vector<double>(3));
    for (int i = 0; i < n; ++i) {
        double t = U(rng);
        double h = H(rng);
        X[i][0] = t * cos(t);
        X[i][1] = h;
        X[i][2] = t * sin(t);
    }
    return X;
}

// Euclidean distance between rows i and j
static double dist(const vector<vector<double>>& X, int i, int j) {
    double s = 0.0;
    int D = (int)X[0].size();
    for (int d = 0; d < D; ++d) s += sqr(X[i][d] - X[j][d]);
    return sqrt(s);
}

// Build k-NN graph (symmetric) with Euclidean weights
vector<vector<pair<int,double>>> buildKNNGraph(const vector<vector<double>>& X, int k) {
    int n = (int)X.size();
    vector<vector<pair<int,double>>> adj(n);
    for (int i = 0; i < n; ++i) {
        vector<pair<double,int>> cand;
        cand.reserve(n-1);
        for (int j = 0; j < n; ++j) if (i != j) cand.push_back({dist(X,i,j), j});
        nth_element(cand.begin(), cand.begin()+k, cand.end());
        sort(cand.begin(), cand.begin()+k); // ensure deterministic neighbors
        for (int t = 0; t < k; ++t) {
            int j = cand[t].second;
            double w = cand[t].first;
            adj[i].push_back({j, w});
            adj[j].push_back({i, w}); // make symmetric
        }
    }
    // Optional: deduplicate symmetric edges
    for (int i = 0; i < n; ++i) {
        sort(adj[i].begin(), adj[i].end());
        adj[i].erase(unique(adj[i].begin(), adj[i].end(),
                            [](auto& a, auto& b){ return a.first == b.first; }),
                     adj[i].end());
    }
    return adj;
}

// Dijkstra from a single source
vector<double> dijkstra(const vector<vector<pair<int,double>>>& adj, int s) {
    int n = (int)adj.size();
    const double INF = 1e100;
    vector<double> dist(n, INF);
    dist[s] = 0.0;
    using P = pair<double,int>;
    priority_queue<P, vector<P>, greater<P>> pq;
    pq.push({0.0, s});
    while (!pq.empty()) {
        auto [d, u] = pq.top(); pq.pop();
        if (d > dist[u]) continue;
        for (auto [v, w] : adj[u]) {
            if (dist[v] > d + w) {
                dist[v] = d + w;
                pq.push({dist[v], v});
            }
        }
    }
    return dist;
}

// Classical MDS from distance matrix to d dimensions
vector<vector<double>> classicalMDS(const vector<vector<double>>& D, int d) {
    int n = (int)D.size();
    // Build squared distance matrix
    vector<vector<double>> S(n, vector<double>(n));
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j) S[i][j] = sqr(D[i][j]);

    // Double centering: B = -0.5 * J S J
    vector<double> rowMean(n, 0.0), colMean(n, 0.0);
    double totalMean = 0.0;
    for (int i = 0; i < n; ++i) for (int j = 0; j < n; ++j) rowMean[i] += S[i][j];
    for (int i = 0; i < n; ++i) rowMean[i] /= n;
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < n; ++i) colMean[j] += S[i][j];
        colMean[j] /= n;
    }
    for (int i = 0; i < n; ++i) for (int j = 0; j < n; ++j) totalMean += S[i][j];
    totalMean /= (n * n);

    vector<vector<double>> B(n, vector<double>(n));
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            B[i][j] = -0.5 * (S[i][j] - rowMean[i] - colMean[j] + totalMean);

    // Eigen-decomposition of symmetric B
    EigenResult er = jacobiEigen(B);

    // Take top d positive eigenvalues/eigenvectors
    vector<vector<double>> Y(n, vector<double>(d, 0.0));
    int taken = 0;
    for (int k = 0; k < (int)er.values.size() && taken < d; ++k) {
        double lambda = er.values[k];
        if (lambda <= 0) continue; // skip nonpositive
        double sroot = sqrt(lambda);
        for (int i = 0; i < n; ++i) Y[i][taken] = er.vectors[i][k] * sroot;
        ++taken;
    }
    // If not enough positive eigenvalues, remaining columns stay zero
    return Y;
}

int main(){
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n = 120;    // number of points
    int k = 8;      // neighbors
    int outDim = 2; // target dimension

    // 1) Data
    auto X = swissRoll(n);

    // 2) Build k-NN graph
    auto adj = buildKNNGraph(X, k);

    // 3) All-pairs shortest paths (geodesic distances); assumes the graph is connected
    vector<vector<double>> G(n, vector<double>(n, 0.0));
    for (int i = 0; i < n; ++i) {
        G[i] = dijkstra(adj, i);
    }

    // 4) Classical MDS to obtain embedding
    auto Y = classicalMDS(G, outDim);

    // 5) Output 2D coordinates
    cout << fixed << setprecision(6);
    for (int i = 0; i < n; ++i) {
        cout << Y[i][0] << ", " << Y[i][1] << "\n";
    }
    return 0;
}

This program demonstrates Isomap. It generates a 3D Swiss roll, builds a symmetric k-nearest neighbor graph with Euclidean edge weights, and computes approximate geodesic distances between all pairs by running Dijkstra from each node. It then applies classical MDS by double-centering the squared geodesic distances to produce a Gram matrix, performs a Jacobi eigendecomposition, and takes the top d eigenvectors scaled by the square roots of their eigenvalues as the 2D embedding. The Jacobi eigen-solver is simple and adequate for small n, making the example self-contained without external libraries.

Time: O(n^2 D) for brute-force k-NN + O(n^2 k log n) for all-pairs Dijkstra + O(n^3) for the eigendecomposition (dominant for larger n).
Space: O(n^2) to store the distance matrix and Gram matrix; O(nk) for the graph.
Locally Linear Embedding (LLE): weights, (I−W)^T(I−W), and bottom eigenvectors (Jacobi)
#include <bits/stdc++.h>
using namespace std;

struct EigenResult {
    vector<double> values;          // eigenvalues
    vector<vector<double>> vectors; // columns are eigenvectors
};

static pair<int,int> maxOffDiag(const vector<vector<double>>& A) {
    int n = (int)A.size();
    int p = 0, q = 1;
    double m = 0.0;
    for (int i = 0; i < n; ++i) for (int j = i+1; j < n; ++j) {
        double v = fabs(A[i][j]);
        if (v > m) { m = v; p = i; q = j; }
    }
    return {p, q};
}

static EigenResult jacobiEigen(vector<vector<double>> A, int maxIter = 10000, double eps = 1e-10) {
    int n = (int)A.size();
    vector<vector<double>> V(n, vector<double>(n, 0.0));
    for (int i = 0; i < n; ++i) V[i][i] = 1.0;
    for (int iter = 0; iter < maxIter; ++iter) {
        auto [p, q] = maxOffDiag(A);
        if (fabs(A[p][q]) < eps) break;
        double app = A[p][p], aqq = A[q][q], apq = A[p][q];
        double phi = 0.5 * atan2(2.0*apq, aqq - app);
        double c = cos(phi), s = sin(phi);
        for (int i = 0; i < n; ++i) if (i != p && i != q) {
            double aip = A[i][p], aiq = A[i][q];
            A[i][p] = c*aip - s*aiq; A[p][i] = A[i][p];
            A[i][q] = s*aip + c*aiq; A[q][i] = A[i][q];
        }
        double new_app = c*c*app - 2*s*c*apq + s*s*aqq;
        double new_aqq = s*s*app + 2*s*c*apq + c*c*aqq;
        A[p][p] = new_app; A[q][q] = new_aqq;
        A[p][q] = 0.0; A[q][p] = 0.0;
        for (int i = 0; i < n; ++i) {
            double vip = V[i][p], viq = V[i][q];
            V[i][p] = c*vip - s*viq;
            V[i][q] = s*vip + c*viq;
        }
    }
    vector<double> evals(n);
    for (int i = 0; i < n; ++i) evals[i] = A[i][i];
    // Sort ascending (for LLE we need the smallest eigenvalues)
    vector<int> idx(n);
    iota(idx.begin(), idx.end(), 0);
    sort(idx.begin(), idx.end(), [&](int a, int b){ return evals[a] < evals[b]; });
    vector<double> sortedVals(n);
    vector<vector<double>> sortedVecs(n, vector<double>(n));
    for (int k = 0; k < n; ++k) {
        sortedVals[k] = evals[idx[k]];
        for (int i = 0; i < n; ++i) sortedVecs[i][k] = V[i][idx[k]];
    }
    return {sortedVals, sortedVecs};
}

static double sqr(double x){ return x*x; }

// Generate Swiss roll
vector<vector<double>> swissRoll(int n, unsigned seed = 1337) {
    mt19937 rng(seed);
    uniform_real_distribution<double> U(1.5*M_PI, 4.5*M_PI);
    uniform_real_distribution<double> H(0.0, 10.0);
    vector<vector<double>> X(n, vector<double>(3));
    for (int i = 0; i < n; ++i) {
        double t = U(rng);
        double h = H(rng);
        X[i][0] = t * cos(t);
        X[i][1] = h;
        X[i][2] = t * sin(t);
    }
    return X;
}

static double euclid(const vector<vector<double>>& X, int i, int j) {
    double s = 0.0;
    int D = (int)X[0].size();
    for (int d = 0; d < D; ++d) s += sqr(X[i][d] - X[j][d]);
    return sqrt(s);
}

vector<vector<int>> knnIndices(const vector<vector<double>>& X, int k) {
    int n = (int)X.size();
    vector<vector<int>> N(n);
    for (int i = 0; i < n; ++i) {
        vector<pair<double,int>> cand;
        cand.reserve(n-1);
        for (int j = 0; j < n; ++j) if (i != j) cand.push_back({euclid(X,i,j), j});
        nth_element(cand.begin(), cand.begin()+k, cand.end());
        sort(cand.begin(), cand.begin()+k);
        N[i].resize(k);
        for (int t = 0; t < k; ++t) N[i][t] = cand[t].second;
    }
    return N;
}

// Solve A x = b using Gaussian elimination with partial pivoting (small k)
vector<double> solveLinear(vector<vector<double>> A, vector<double> b) {
    int n = (int)A.size();
    for (int col = 0; col < n; ++col) {
        // pivot
        int piv = col;
        for (int r = col+1; r < n; ++r) if (fabs(A[r][col]) > fabs(A[piv][col])) piv = r;
        // near-singular pivots are tolerated; the caller regularizes C beforehand
        if (piv != col) { swap(A[piv], A[col]); swap(b[piv], b[col]); }
        double diag = A[col][col];
        for (int j = col; j < n; ++j) A[col][j] /= diag;
        b[col] /= diag;
        for (int r = 0; r < n; ++r) if (r != col) {
            double factor = A[r][col];
            for (int j = col; j < n; ++j) A[r][j] -= factor * A[col][j];
            b[r] -= factor * b[col];
        }
    }
    return b; // now the solution
}

int main(){
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n = 120;       // number of points
    int D = 3;         // ambient dimension
    int k = 10;        // neighbors
    int outDim = 2;    // target dimension
    double reg = 1e-3; // LLE regularization strength

    // 1) Data
    auto X = swissRoll(n);

    // 2) k-NN indices
    auto Nbrs = knnIndices(X, k);

    // 3) Compute reconstruction weights W (n x n, sparse but represented densely here)
    vector<vector<double>> W(n, vector<double>(n, 0.0));

    for (int i = 0; i < n; ++i) {
        // Build Z: D x k neighbor difference matrix
        vector<vector<double>> Z(D, vector<double>(k, 0.0));
        for (int t = 0; t < k; ++t) {
            int j = Nbrs[i][t];
            for (int d = 0; d < D; ++d)
                Z[d][t] = X[j][d] - X[i][d];
        }
        // Local covariance C = Z^T Z (k x k)
        vector<vector<double>> C(k, vector<double>(k, 0.0));
        for (int a = 0; a < k; ++a)
            for (int b = 0; b < k; ++b)
                for (int d = 0; d < D; ++d)
                    C[a][b] += Z[d][a] * Z[d][b];
        // Regularize: C += reg * trace(C) * I
        double tr = 0.0;
        for (int a = 0; a < k; ++a) tr += C[a][a];
        double alpha = reg * (tr > 0 ? tr : 1.0);
        for (int a = 0; a < k; ++a) C[a][a] += alpha;

        // Solve C w = 1, then normalize sum(w) = 1
        vector<double> ones(k, 1.0);
        vector<double> w = solveLinear(C, ones);
        double sumw = 0.0;
        for (double v : w) sumw += v;
        if (fabs(sumw) < 1e-12) sumw = 1.0; // safety
        for (double &v : w) v /= sumw;

        // Store in W
        for (int t = 0; t < k; ++t) W[i][Nbrs[i][t]] = w[t];
    }

    // 4) Build M = (I - W)^T (I - W)
    vector<vector<double>> IminusW(n, vector<double>(n, 0.0));
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) IminusW[i][j] = -W[i][j];
        IminusW[i][i] += 1.0;
    }
    vector<vector<double>> M(n, vector<double>(n, 0.0));
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            double s = 0.0;
            for (int t = 0; t < n; ++t) s += IminusW[t][i] * IminusW[t][j];
            M[i][j] = s;
        }
    }

    // 5) Smallest eigenvectors of symmetric M
    EigenResult er = jacobiEigen(M);

    // Skip the first eigenvector (approximately constant); take the next outDim.
    vector<vector<double>> Y(n, vector<double>(outDim, 0.0));
    for (int comp = 0; comp < outDim; ++comp) {
        int col = comp + 1; // skip smallest
        for (int i = 0; i < n; ++i) Y[i][comp] = er.vectors[i][col];
    }

    // 6) Output 2D coordinates
    cout << fixed << setprecision(6);
    for (int i = 0; i < n; ++i) {
        cout << Y[i][0] << ", " << Y[i][1] << "\n";
    }
    return 0;
}

This program implements LLE. For each point, it finds k nearest neighbors, computes the local covariance of neighbor differences, regularizes it, and solves for reconstruction weights that sum to one. It then forms M = (I−W)^T(I−W) and computes its smallest nontrivial eigenvectors with a Jacobi solver. The embedding is given by eigenvectors 2..(d+1). The code is dense and cubic for clarity; in practice, exploit W’s sparsity and use iterative eigensolvers for scalability.

Time: O(n^2 D) to compute k-NN by brute force + O(n k^3) to solve local systems + O(n^3) to form and decompose M densely.
Space: O(n^2) for W and M in dense form; O(nk) if stored sparsely.
Tags: manifold learning, isomap, locally linear embedding, laplacian eigenmaps, k-NN graph, geodesic distance, classical MDS, spectral embedding, intrinsic dimension, swiss roll, jacobi eigensolver, shortest path, graph laplacian, double-centering, unsupervised learning