Gauge Equivariant Networks
Key Points
- Gauge equivariant networks are neural networks that respect local symmetries (gauges) on manifolds, such as how vectors rotate when you change the local reference frame on a surface.
- They work by representing features in fibers (like tangent spaces) and ensuring every operation (transport, convolution, nonlinearity) transforms correctly under the structure group (e.g., SO(2) on surfaces).
- Discrete implementations on meshes use parallel transport angles on edges to align neighbor features before aggregation, and restrict weights to intertwiners that do not break equivariance.
- A practical recipe is to expand features into circular-harmonic types (orders m=0,1,2,...), where rotations act by simple 2×2 blocks, and to share weights within each order.
- Nonlinearities must be chosen carefully (e.g., norm-based or gated) so that higher-order vector features remain equivariant.
- You can numerically verify equivariance by randomly changing gauges per vertex, updating edge transport angles accordingly, and checking that outputs rotate exactly as predicted.
- The computational cost is similar to message-passing GNNs, typically O(E · F) per layer, with small extra trigonometric costs for rotations.
- Gauge equivariance is essential for learning on surfaces, vector fields, and physics problems where predictions should not depend on arbitrary local frame choices.
Prerequisites
- Basic group theory: Understand group actions, representations, and the difference between invariance and equivariance.
- Linear algebra: Features live in vector spaces and are transformed by matrices; weights are intertwiners between representations.
- Multivariable calculus and manifolds (intro): Frames, tangent spaces, and the exponential map underpin convolution on manifolds.
- Graph neural networks (GNNs): Gauge-equivariant layers on meshes are implemented as message passing with special alignment and weights.
- Fourier analysis on the circle: Circular harmonics (cos/sin pairs) are the irreducible representations of SO(2) used to rotate features.
- Differential geometry (connections, parallel transport): Edge transport angles discretize connections; equivariance relies on correct parallel transport.
- Representation theory (irreps, intertwiners): Weight sharing per order and nonlinearity design come from representation-theoretic constraints.
- Numerical methods on meshes/graphs: Discretization choices (angles, neighbors) affect stability and correctness.
- C++ programming (vectors, performance): Implementing efficient rotations, aggregations, and tests requires solid C++.
- Trigonometry: Rotations use sin/cos and angle normalization; numerical care is needed.
Detailed Explanation
01 Overview
Gauge equivariant networks are neural architectures designed to process data that live on manifolds with local symmetries (gauges). Think of a 2D surface like a curved sheet: at every point you can draw a little x-y axis (a tangent frame). There is no globally consistent way to pick all these axes on a curved surface, and you are free to rotate each local frame without changing the underlying geometry. A network is gauge equivariant if its outputs transform predictably when you rotate these local frames. This is crucial when inputs and outputs are not just scalars but geometric objects like tangent vectors or higher-order features that live in fibers of a bundle over the manifold.

In practice, we discretize the manifold as a mesh or graph, store a frame (gauge) per node, and store how to align frames along edges (a discrete connection, or parallel transport angle). Features are decomposed into irreducible representations of the gauge group (e.g., circular harmonics for SO(2)), which makes rotations act by tiny block matrices. Layers align neighbor features via parallel transport, apply order-wise shared weights (intertwiners), aggregate, and optionally apply equivariant nonlinearities. With these ingredients, the network's predictions depend only on the underlying geometry and data, not on the arbitrary choice of frames.
02 Intuition & Analogies
Imagine you are hiking across a hilly terrain (a surface). At every point you carry a tiny compass (your local frame) showing "east" and "north." When you walk to a neighbor point, the notion of east/north twists a little, depending on the terrain's curvature; this twist is parallel transport. Now suppose you want a rule to combine wind measurements (vectors) from your neighbors to predict the wind at your location. If your friend uses a different brand of compass (rotated by, say, 30° everywhere), both of you should still get predictions that agree up to that same 30° rotation. A network with this property is gauge equivariant.

A second analogy: playlists and headphones. You can wear your headphones at any rotation; the song you hear "rotates" with the headphones, but the underlying melody is unchanged. If two people wear their headphones differently, a device that analyzes the song should produce the same conclusions regardless of headphone angle. In geometry, the headphone angle is the gauge (local frame choice), and the song is the true geometric signal.

Operationally, we break features into types that we know how to rotate easily, like splitting a signal on a circle into sines and cosines. When we rotate the frame by an angle θ, an m-th order sine/cosine pair just rotates by m·θ via a 2×2 matrix. That makes it cheap to "line up" neighbor features by undoing their local rotation before summing. We then only allow weights that do not mix different types (orders), so the whole computation commutes with every local rotation. The result is a model that learns geometry-aware patterns without being confused by arbitrary frame choices.
03 Formal Definition
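The rendered math for this section did not survive extraction. A standard statement of the condition, sketched here using the symbols of the surrounding discussion (Φ for the layer, ρ_in/ρ_out for the input and output representations, M for the manifold, G for the structure group):

```latex
% A layer \Phi acting on feature sections is gauge equivariant when
\Phi\big(\rho_{\mathrm{in}}(g)\, s\big) \;=\; \rho_{\mathrm{out}}(g)\, \Phi(s)
\qquad \text{for every gauge transformation } g : M \to G,
% where g acts pointwise on a feature section s by
(g \cdot s)(x) \;=\; \rho_{\mathrm{in}}(g(x))\, s(x).
```

In words: applying a pointwise change of frame before the layer gives the same result as applying the layer first and then transforming the output, so the computation commutes with any choice of local frames.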
04 When to Use
- Surface or manifold data: Learning from signals on curved surfaces (e.g., textures on 3D meshes, climate fields on the sphere) where directions matter and rotate from point to point.
- Vector/tensor fields: Tasks whose inputs/outputs are not scalars (wind velocity on Earth, fiber orientations in diffusion MRI, principal curvature directions, electromagnetic fields), where predictions must rotate with local frames.
- Physics and PDEs on meshes: When physical laws are covariant and you want models that respect local symmetry, avoiding spurious frame-dependent artifacts.
- Low-data or generalization-critical regimes: Equivariance bakes in strong inductive bias, improving data efficiency and robustness when orientations vary or gauges are arbitrary.
- Non-Euclidean sensor layouts: Graphs from geodesic neighborhoods, point clouds on surfaces, or non-orientable domains where a global orientation is impossible; local gauge handling keeps the model consistent.

Avoid gauge-equivariant machinery when: signals are purely scalar and orientation-agnostic (a standard GNN/CNN suffices), the domain is flat with a fixed global frame (ordinary SE(2)/SO(3) equivariance may be simpler), or a reliable discrete connection (parallel transport) is infeasible or noisy enough to dominate errors.
⚠️ Common Mistakes
- Confusing invariance with equivariance: Invariance forces outputs not to change; equivariance requires predictable change (e.g., vectors rotate). Using invariant pooling when you need vector outputs discards information.
- Forgetting frame alignment: Aggregating neighbors without parallel transport (or with the wrong sign) breaks equivariance. Always rotate neighbor features into the receiver's frame before combining.
- Mixing representation orders: Weight matrices must not mix different harmonic orders (m); only mix channels within the same order. Cross-order mixing violates the intertwiner constraint and breaks equivariance.
- Using standard nonlinearities on vector/tensor channels: ReLU on each component is not equivariant. Use norm-based, gated, or tensor product nonlinearities designed for the representation.
- Ignoring reflections when the domain/group is O(2) rather than SO(2): If local frames can flip, include reflection parity and use the correct representation; otherwise, predictions will be inconsistent on non-orientable meshes.
- Inconsistent angle conventions: Swapping source/target, mixing degrees and radians, or adding instead of subtracting angles in updates like \theta'_{u\leftarrow v}=\theta_{u\leftarrow v}+\gamma(u)-\gamma(v) yields subtle bugs.
- Numerical drift: Accumulating many transports around loops (holonomy) can cause angle wrap issues; always normalize angles to (-\pi,\pi] and test loop consistency.
- Overcomplicating kernels: Start with simple radial weights shared per order; only add complex steerable bases once the pipeline is correct and numerically verified.
Key Formulas
Gauge Equivariance
\Phi\big(\rho_{\mathrm{in}}(g)\,s\big) = \rho_{\mathrm{out}}(g)\,\Phi(s)
Explanation: This states that if inputs are transformed by a gauge field g via the input representation, the layer's outputs transform by the output representation at each point. The computation commutes with any choice of local frames.
Input Gauge Action
(g \cdot s)(x) = \rho_{\mathrm{in}}(g(x))\,s(x)
Explanation: A feature section s transforms pointwise under a gauge change g by the input representation. This encodes how components rotate with the frame.
Steerability Constraint
K(\rho(g)\,v) = \rho_{\mathrm{out}}(g)\,K(v)\,\rho_{\mathrm{in}}(g)^{-1}
Explanation: The kernel defined in a local frame must transform consistently when we rotate the local coordinates. This ensures the resulting convolution does not depend on the arbitrary frame.
Manifold Convolution (Local Chart)
(K \star f)(x) = \int_{T_x M} K(v)\, f(\exp_x v)\, \mathrm{d}v
Explanation: At each point x, integrate a kernel over the tangent space using the exponential map to sample neighbors. With a steerable kernel, this is gauge independent.
SO(2) Irrep (Order m>0)
\rho_m(\theta) = \begin{pmatrix} \cos m\theta & -\sin m\theta \\ \sin m\theta & \cos m\theta \end{pmatrix}
Explanation: Rotating the frame by θ rotates a sine/cosine pair of order m by mθ with a 2×2 rotation block. This is the basic building block for feature rotation.
Discrete Gauge-Equivariant Message Passing
f'_u = \sum_{v \in N(u)} \Phi\big(\rho(\theta_{u\leftarrow v})\, f_v\big)
Explanation: Neighbor features are first aligned to u's frame via the connection angle \theta_{u\leftarrow v}, processed by order-wise weights Φ (intertwiners), and summed. This guarantees equivariance if Φ respects representation blocks.
Gauge Update of Edge Transport (SO(2))
\theta'_{u\leftarrow v} = \theta_{u\leftarrow v} + \gamma(u) - \gamma(v)
Explanation: When you rotate the frame at each vertex by a gauge field \gamma, the edge transport angles must be updated accordingly. This preserves the meaning of parallel transport between frames.
Feature Dimension by Orders
D = C_0 + \sum_{m \ge 1} 2\,C_m
Explanation: The total number of scalar components per vertex equals the sum over orders: 1 per scalar channel (m=0) and 2 per vector-like channel (m>0). This is useful for complexity estimates.
Norm-based Equivariant Nonlinearity
\eta(f_m) = \sigma(\lVert f_m \rVert)\,\frac{f_m}{\lVert f_m \rVert}
Explanation: A function of the norm (invariant under rotations) scales the vector without changing its direction relative to the frame. This preserves equivariance for m>0 channels.
Complexity Analysis
With F = C_0 + Σ_{m≥1} 2C_m scalar components per vertex, one message-passing layer touches every directed edge once, so the time cost is O(E · F) per layer (plus per-channel weight mixing within each order), with an extra O(E) sin/cos evaluations for the transport rotations. This matches a standard message-passing GNN up to a small constant. Memory is O(V · F) for features plus O(E) for transport angles.
Code Examples
#include <bits/stdc++.h>
using namespace std;

// Rotate a feature vector organized by SO(2) orders m=0..L.
// For each order m:
//  - m==0: C_m scalar channels (unchanged by rotation)
//  - m>0 : C_m pairs (cos,sin) per channel rotated by angle m*theta
// Layout per m is contiguous: [m=0 scalars][m=1 (c0,c1,c0',c1'), ...][m=2 ...]

struct HarmonicLayout {
    vector<int> C;    // C[m] = number of channels at order m
    vector<int> off;  // starting offset for order m
    vector<int> size; // total scalar components for order m
    int total = 0;    // total feature dimension

    explicit HarmonicLayout(const vector<int>& C_) : C(C_) {
        int L = (int)C.size() - 1;
        off.resize(L+1); size.resize(L+1);
        int cur = 0;
        for (int m = 0; m <= L; ++m) {
            int dm = (m==0 ? 1 : 2);
            off[m] = cur;
            size[m] = dm * C[m];
            cur += size[m];
        }
        total = cur;
    }
};

vector<double> rotate_so2(const vector<double>& feat, double theta, const HarmonicLayout& H) {
    vector<double> out(H.total);
    for (int m = 0; m < (int)H.C.size(); ++m) {
        int C_m = H.C[m];
        int base = H.off[m];
        if (m == 0) {
            for (int c = 0; c < C_m; ++c) out[base + c] = feat[base + c];
        } else {
            double ang = m * theta;
            double cs = cos(ang), sn = sin(ang);
            for (int c = 0; c < C_m; ++c) {
                double a = feat[base + 2*c + 0]; // cos component
                double b = feat[base + 2*c + 1]; // sin component
                // [a'; b'] = R_m(theta) [a; b]
                out[base + 2*c + 0] = cs * a - sn * b;
                out[base + 2*c + 1] = sn * a + cs * b;
            }
        }
    }
    return out;
}

int main() {
    // Example: orders m=0..2, with C0=1 scalar, C1=2 vector channels, C2=1 second-order channel
    HarmonicLayout H({1, 2, 1});
    cout << "Total feature dim = " << H.total << "\n";

    // Build a feature: [m=0: s0] [m=1: (a1,b1),(a2,b2)] [m=2: (c1,d1)]
    vector<double> f(H.total);
    // m=0
    f[H.off[0] + 0] = 3.0;
    // m=1 (two channels)
    f[H.off[1] + 0] = 1.0;  // a1
    f[H.off[1] + 1] = 0.5;  // b1
    f[H.off[1] + 2] = -0.3; // a2
    f[H.off[1] + 3] = 2.0;  // b2
    // m=2 (one channel)
    f[H.off[2] + 0] = 0.7;  // c1
    f[H.off[2] + 1] = -1.2; // d1

    double theta = 0.4; // radians
    auto fr = rotate_so2(f, theta, H);

    // Rotate back by -theta; should recover original
    auto f_rr = rotate_so2(fr, -theta, H);

    cout.setf(std::ios::fixed); cout << setprecision(6);
    cout << "Original vs. Rotated-back (should match)\n";
    double max_err = 0.0;
    for (int i = 0; i < H.total; ++i) {
        double err = fabs(f[i] - f_rr[i]);
        max_err = max(max_err, err);
        cout << i << ": " << f[i] << " vs " << f_rr[i] << "\n";
    }
    cout << "Max error: " << max_err << "\n";
    return 0;
}
This program defines a layout for SO(2) harmonic orders and rotates features accordingly. Scalar channels (m=0) stay fixed, while each (cos,sin) pair for m>0 is rotated by m·θ using a 2×2 block. The test rotates a feature by θ and back by −θ to verify correctness numerically.
#include <bits/stdc++.h>
using namespace std;

struct HarmonicLayout {
    vector<int> C, off, size;
    int total = 0;
    explicit HarmonicLayout(const vector<int>& C_) : C(C_) {
        int L = (int)C.size() - 1;
        off.resize(L+1); size.resize(L+1);
        int cur = 0;
        for (int m = 0; m <= L; ++m) {
            int dm = (m==0 ? 1 : 2);
            off[m] = cur; size[m] = dm * C[m]; cur += size[m];
        }
        total = cur;
    }
};

vector<double> rotate_so2(const vector<double>& feat, double theta, const HarmonicLayout& H) {
    vector<double> out(H.total);
    for (int m = 0; m < (int)H.C.size(); ++m) {
        int Cm = H.C[m], base = H.off[m];
        if (m == 0) {
            for (int c = 0; c < Cm; ++c) out[base + c] = feat[base + c];
        } else {
            double ang = m * theta;
            double cs = cos(ang), sn = sin(ang);
            for (int c = 0; c < Cm; ++c) {
                double a = feat[base + 2*c + 0];
                double b = feat[base + 2*c + 1];
                out[base + 2*c + 0] = cs * a - sn * b;
                out[base + 2*c + 1] = sn * a + cs * b;
            }
        }
    }
    return out;
}

// Directed edge with transport angle from v -> u (align v's frame to u's frame)
struct Edge { int v; double theta; double w; };

// Gauge-equivariant layer: per-order weight matrices (do not mix orders)
struct GaugeLayer {
    HarmonicLayout H_in, H_out;
    vector<vector<double>> W0;         // m=0: [C0_out x C0_in]
    vector<vector<vector<double>>> Wm; // m>0: per m, [C_out x C_in]

    GaugeLayer(const HarmonicLayout& Hin, const HarmonicLayout& Hout)
        : H_in(Hin), H_out(Hout) {
        int L = (int)H_in.C.size() - 1; // assume same L in/out
        // Initialize weights with small random values
        std::mt19937 rng(42); std::normal_distribution<double> N(0.0, 0.1);
        W0.assign(H_out.C[0], vector<double>(H_in.C[0]));
        for (int i = 0; i < H_out.C[0]; ++i)
            for (int j = 0; j < H_in.C[0]; ++j) W0[i][j] = N(rng);
        Wm.resize(L+1);
        for (int m = 1; m <= L; ++m) {
            Wm[m].assign(H_out.C[m], vector<double>(H_in.C[m]));
            for (int i = 0; i < H_out.C[m]; ++i)
                for (int j = 0; j < H_in.C[m]; ++j) Wm[m][i][j] = N(rng);
        }
    }

    // Forward on a graph: adj[u] lists incoming neighbors v with transport theta_{u<-v}
    vector<vector<double>> forward(const vector<vector<Edge>>& adj, const vector<vector<double>>& X) {
        int V = (int)adj.size();
        vector<vector<double>> Y(V, vector<double>(H_out.total, 0.0));
        for (int u = 0; u < V; ++u) {
            // Accumulators per order (in u's frame), initialized to zero
            vector<double> acc0(H_out.C[0], 0.0);
            vector<vector<double>> accm(H_out.C.size()); // m>0: (cos,sin) pairs per out channel
            for (int m = 1; m < (int)H_out.C.size(); ++m) accm[m] = vector<double>(2*H_out.C[m], 0.0);

            for (const Edge& e : adj[u]) {
                // Align neighbor feature to u's frame
                vector<double> x_align = rotate_so2(X[e.v], e.theta, H_in);
                // m=0: scalar mixing
                for (int i = 0; i < H_out.C[0]; ++i) {
                    double sum = 0.0;
                    for (int j = 0; j < H_in.C[0]; ++j)
                        sum += W0[i][j] * x_align[H_in.off[0] + j];
                    acc0[i] += e.w * sum;
                }
                // m>0: mix channels within the same m; apply weights identically to cos/sin components
                for (int m = 1; m < (int)H_in.C.size(); ++m) {
                    int Cin = H_in.C[m], Cout = H_out.C[m];
                    int inb = H_in.off[m];
                    for (int i = 0; i < Cout; ++i) {
                        double sum_c = 0.0, sum_s = 0.0;
                        for (int j = 0; j < Cin; ++j) {
                            double w_ij = Wm[m][i][j];
                            sum_c += w_ij * x_align[inb + 2*j + 0];
                            sum_s += w_ij * x_align[inb + 2*j + 1];
                        }
                        accm[m][2*i+0] += e.w * sum_c;
                        accm[m][2*i+1] += e.w * sum_s;
                    }
                }
            }
            // Write accumulators into output feature vector Y[u]
            for (int i = 0; i < H_out.C[0]; ++i) Y[u][H_out.off[0]+i] = acc0[i];
            for (int m = 1; m < (int)H_out.C.size(); ++m)
                for (int i = 0; i < H_out.C[m]; ++i) {
                    Y[u][H_out.off[m] + 2*i + 0] = accm[m][2*i+0];
                    Y[u][H_out.off[m] + 2*i + 1] = accm[m][2*i+1];
                }
        }
        return Y;
    }
};

int main() {
    // Simple graph: 4 vertices in a chain 0-1-2-3
    int V = 4; vector<vector<Edge>> adj(V);
    auto add_dir = [&](int u, int v, double theta, double w) { adj[u].push_back({v, theta, w}); };

    // Example SO(2) orders: m=0: 1 channel, m=1: 1 channel (2 comps) => total dim = 3
    HarmonicLayout Hin({1,1}); HarmonicLayout Hout({1,1});

    // Edge transports (toy angles), made directed both ways with opposite sign
    add_dir(0,1, +0.2, 1.0); add_dir(1,0, -0.2, 1.0);
    add_dir(1,2, -0.1, 1.0); add_dir(2,1, +0.1, 1.0);
    add_dir(2,3, +0.3, 1.0); add_dir(3,2, -0.3, 1.0);

    // Random input features per vertex
    std::mt19937 rng(0); std::uniform_real_distribution<double> U(-1.0, 1.0);
    vector<vector<double>> X(V, vector<double>(Hin.total));
    for (int u = 0; u < V; ++u)
        for (int i = 0; i < Hin.total; ++i) X[u][i] = U(rng);

    GaugeLayer layer(Hin, Hout);
    auto Y = layer.forward(adj, X);

    cout.setf(std::ios::fixed); cout << setprecision(4);
    for (int u = 0; u < V; ++u) {
        cout << "Y[" << u << "] = ";
        for (double z : Y[u]) cout << z << ' ';
        cout << '\n';
    }
    return 0;
}
This program implements a single gauge-equivariant message passing layer on a small directed graph. Each edge stores a parallel transport angle that aligns neighbor features to the receiver's frame. Features are organized by harmonic order; weights mix channels only within the same order (intertwiners), preserving equivariance. The output at each node is the sum of aligned, linearly mixed neighbor features.
#include <bits/stdc++.h>
using namespace std;

struct HarmonicLayout {
    vector<int> C, off, size;
    int total = 0;
    explicit HarmonicLayout(const vector<int>& C_) : C(C_) {
        int L = (int)C.size() - 1;
        off.resize(L+1); size.resize(L+1);
        int cur = 0;
        for (int m = 0; m <= L; ++m) {
            int dm = (m==0 ? 1 : 2);
            off[m] = cur; size[m] = dm * C[m]; cur += size[m];
        }
        total = cur;
    }
};

vector<double> rotate_so2(const vector<double>& x, double theta, const HarmonicLayout& H) {
    vector<double> y(H.total);
    for (int m = 0; m < (int)H.C.size(); ++m) {
        int Cm = H.C[m], base = H.off[m];
        if (m == 0) {
            for (int c = 0; c < Cm; ++c) y[base + c] = x[base + c];
        } else {
            double a = m * theta, cs = cos(a), sn = sin(a);
            for (int c = 0; c < Cm; ++c) {
                double xc = x[base + 2*c + 0], xs = x[base + 2*c + 1];
                y[base + 2*c + 0] = cs * xc - sn * xs;
                y[base + 2*c + 1] = sn * xc + cs * xs;
            }
        }
    }
    return y;
}

struct Edge { int v; double theta; double w; };

struct GaugeLayer {
    HarmonicLayout H;
    vector<double> W0, W1; // simple: same in/out channels and only m=0,1
    explicit GaugeLayer(const HarmonicLayout& H_) : H(H_) {
        std::mt19937 rng(123); std::normal_distribution<double> N(0.0, 0.1);
        W0.resize(H.C[0]*H.C[0]); for (double& z : W0) z = N(rng);
        W1.resize(H.C[1]*H.C[1]); for (double& z : W1) z = N(rng);
    }
    vector<vector<double>> forward(const vector<vector<Edge>>& adj, const vector<vector<double>>& X) {
        int V = (int)adj.size();
        vector<vector<double>> Y(V, vector<double>(H.total, 0.0));
        for (int u = 0; u < V; ++u) {
            // accumulators
            vector<double> acc0(H.C[0], 0.0), acc1(2*H.C[1], 0.0);
            for (const Edge& e : adj[u]) {
                auto xa = rotate_so2(X[e.v], e.theta, H); // align into u's frame
                // m=0 mixing: y0_i += w * sum_j W0[i,j] * x0_j
                for (int i = 0; i < H.C[0]; ++i) {
                    double s = 0.0;
                    for (int j = 0; j < H.C[0]; ++j) s += W0[i*H.C[0]+j] * xa[H.off[0]+j];
                    acc0[i] += e.w * s;
                }
                // m=1 mixing: apply the same weights to cos/sin components
                for (int i = 0; i < H.C[1]; ++i) {
                    double sc = 0.0, ss = 0.0;
                    for (int j = 0; j < H.C[1]; ++j) {
                        double w = W1[i*H.C[1]+j];
                        sc += w * xa[H.off[1]+2*j+0];
                        ss += w * xa[H.off[1]+2*j+1];
                    }
                    acc1[2*i+0] += e.w * sc;
                    acc1[2*i+1] += e.w * ss;
                }
            }
            // write out
            for (int i = 0; i < H.C[0]; ++i) Y[u][H.off[0]+i] = acc0[i];
            for (int i = 0; i < H.C[1]; ++i) {
                Y[u][H.off[1]+2*i+0] = acc1[2*i+0];
                Y[u][H.off[1]+2*i+1] = acc1[2*i+1];
            }
        }
        return Y;
    }
};

// Apply a random gauge gamma[u] to features and update edge transports:
//   theta' = theta + gamma(u) - gamma(v)
void apply_gauge(const vector<double>& gamma, const HarmonicLayout& H,
                 vector<vector<double>>& X, vector<vector<Edge>>& adj) {
    int V = (int)X.size();
    for (int u = 0; u < V; ++u) X[u] = rotate_so2(X[u], gamma[u], H);
    for (int u = 0; u < V; ++u)
        for (Edge& e : adj[u]) e.theta = e.theta + gamma[u] - gamma[e.v];
    // normalize angles to (-pi, pi]
    auto wrap = [](double a) { while (a <= -M_PI) a += 2*M_PI; while (a > M_PI) a -= 2*M_PI; return a; };
    for (int u = 0; u < V; ++u)
        for (Edge& e : adj[u]) e.theta = wrap(e.theta);
}

int main() {
    // Graph: 5 nodes in a ring to include nontrivial holonomy
    int V = 5; vector<vector<Edge>> adj(V);
    auto add_bidir = [&](int u, int v, double th) {
        adj[u].push_back({v, th, 1.0}); adj[v].push_back({u, -th, 1.0});
    };
    add_bidir(0,1, 0.20); add_bidir(1,2, -0.10); add_bidir(2,3, 0.25);
    add_bidir(3,4, -0.15); add_bidir(4,0, 0.05);

    // Orders m=0: 1 scalar, m=1: 2 vector channels (dim = 1 + 2*2 = 5)
    HarmonicLayout H({1,2});

    // Random features
    std::mt19937 rng(7); std::uniform_real_distribution<double> U(-1.0, 1.0);
    vector<vector<double>> X(V, vector<double>(H.total));
    for (int u = 0; u < V; ++u)
        for (int i = 0; i < H.total; ++i) X[u][i] = U(rng);

    GaugeLayer layer(H);
    auto Y1 = layer.forward(adj, X); // baseline output

    // Draw random per-vertex gauge angles
    vector<double> gamma(V); for (double& g : gamma) g = U(rng);

    // Copy X and adj, apply the gauge
    auto Xg = X; auto adjg = adj;
    apply_gauge(gamma, H, Xg, adjg);

    // Forward under the new gauge
    auto Y2 = layer.forward(adjg, Xg);

    // Rotate Y2 back by -gamma[u] and compare with Y1
    double max_err = 0.0, sum_err = 0.0; int cnt = 0;
    for (int u = 0; u < V; ++u) {
        auto Y2b = rotate_so2(Y2[u], -gamma[u], H);
        for (int i = 0; i < H.total; ++i) {
            double err = fabs(Y1[u][i] - Y2b[i]);
            max_err = max(max_err, err); sum_err += err; ++cnt;
        }
    }
    cout.setf(std::ios::fixed); cout << setprecision(6);
    cout << "Equivariance check: max abs error = " << max_err
         << ", mean abs error = " << (sum_err/cnt) << "\n";
    return 0;
}
This program verifies equivariance numerically. It computes a layer output, then applies a random per-vertex gauge (rotating features and updating edge transport angles) and recomputes. Finally, it rotates the new outputs back by the inverse gauge and compares them to the original outputs. The tiny numerical error demonstrates correct gauge equivariance.