
E(n) Equivariant Neural Networks

Key Points

  • E(n)-equivariant neural networks are models whose outputs transform predictably when inputs are rotated, translated, or reflected in n-dimensional Euclidean space.
  • They enforce physics-like symmetries by building layers from relative vectors and distance-based scalars, which guarantees correct transformation behavior.
  • Equivariance is stronger than invariance: invariant outputs stay the same under transformations, while equivariant outputs change in a controlled, known way.
  • Common E(n)-equivariant designs include EGNN-style message passing layers and force fields that depend only on pairwise distances.
  • Using only relative positions (x_i − x_j) and distance scalars ‖x_i − x_j‖ preserves translation and rotation/reflection symmetries.
  • These models shine on molecules, point clouds, robotics, and physical simulations where geometry matters.
  • Mistakes such as injecting absolute coordinates, adding arbitrary biases to vector channels, or mixing scalars with vectors break equivariance.
  • You can test equivariance numerically by applying a random rotation/translation to inputs and checking that outputs rotate/translate in the same way.

Prerequisites

  • →Linear algebra (vectors, matrices, orthogonality) — Rotations/reflections are represented by orthogonal matrices and vectors transform linearly.
  • →Basic group theory — Equivariance and invariance are defined using group actions and representations.
  • →Neural networks (MLPs, activations) — Equivariant layers are built from standard neural components with geometric constraints.
  • →Graph neural networks — Point-cloud equivariant models commonly use message passing on k-NN graphs.
  • →Floating-point arithmetic and numerical stability — Equivariance tests require tolerances, and stable implementations avoid numerical artifacts.
  • →Euclidean geometry and distances — Using invariants like distances and relative positions is central to E(n) symmetry.

Detailed Explanation


01Overview

Imagine a function that understands shapes the way we do: if you rotate a 3D molecule or slide a 2D point cloud, the meaning shouldn't change unpredictably. E(n)-equivariant neural networks (where E(n) is the Euclidean group in n dimensions) are built so their outputs transform in a predictable, geometrically correct way under rotations, translations, and reflections. Instead of learning this behavior from data (which is sample-inefficient), the symmetry is enforced directly by design. This typically happens by computing with relative positions (differences between points) and invariant scalars (like distances), and by carefully defining how outputs should transform (as scalars, vectors, or higher-order tensors). The payoff is better generalization, fewer training samples, and outputs that respect physical laws and geometric constraints.

Hook: If you rotate a map, roads are still connected; your GPS should not get confused.
Concept: An E(n)-equivariant model changes its internal vector outputs by exactly the same rotation when you rotate the inputs, and it stays unchanged by translations when you translate the inputs.
Example: A force-prediction model for atoms can be built so that if you rotate the molecule, all predicted forces rotate by the same amount, guaranteeing consistent physics.

In practice, this is done using message passing over a graph of points, where messages depend on distances and relative directions, and updates are arranged to preserve the symmetry.

02Intuition & Analogies

Think of Lego structures built from connectors: only the distances between studs and how pieces plug together matter, not where the structure sits on your desk or how it's oriented. If you pick the whole build up and rotate it, all the internal relations stay the same. E(n)-equivariant networks imitate this idea by focusing on relationships (relative vectors and distances) rather than absolute coordinates. Translations disappear when you subtract positions (x_i − x_j). Rotations act as simple rotations of every vector quantity, while scalars like distances stay unchanged.

Another analogy: consider a compass and a ruler. The ruler measures distance (invariant under rotation/translation), and the compass arrow points north (a vector that rotates if you turn the map). A good geometric model understands which things are "ruler-like" (scalars) and which are "compass-like" (vectors). E(n)-equivariant layers keep scalars as scalars and ensure vectors transform like vectors under any rotation or reflection. If you reflect a drawing in a mirror, a proper vector arrow flips direction accordingly; equivariant models guarantee the same for their vector outputs.

Finally, think of consistent instructions: "Walk toward your friend by the distance between you." If you both rotate the whole scene, the instruction remains valid and results rotate consistently. Message-passing layers built on distances and relative directions are like these instructions — they produce outputs that move exactly as the inputs are moved.

03Formal Definition

Let G = E(n) be the Euclidean group of rigid motions in ℝⁿ: all transformations g(x) = Rx + t with R ∈ O(n) (orthogonal matrices, allowing rotations and reflections) and t ∈ ℝⁿ. A function f between spaces with group actions is equivariant if applying g before f equals applying a corresponding action after f. Formally, given representations ρ_X: G → Aut(X) and ρ_Y: G → Aut(Y), f is G-equivariant if f(ρ_X(g) x) = ρ_Y(g) f(x) for all g ∈ G and x ∈ X. In E(n)-equivariant neural nets for point clouds, inputs are point sets X = {x_i}_{i=1}^N ⊂ ℝⁿ with optional features h_i. The standard action of g ∈ E(n) on coordinates is g · x_i = R x_i + t; scalar features remain unchanged (ρ(h_i) = h_i), vector features are multiplied by R, and higher-order tensors transform via appropriate tensor representations. Equivariant layers are constructed from invariants like ‖x_i − x_j‖ and covariants like (x_i − x_j). By ensuring that scalar channels depend only on invariants and that vector channels multiply relative vectors by scalar functions of invariants, the layer satisfies f(g · X) = g · f(X).

04When to Use

  • Molecular modeling: Predict energies (invariant) and forces (equivariant vectors) for atoms in 3D without data augmentation over orientations.
  • Point-cloud perception: Classify objects (invariant label) or predict normals and flow fields (equivariant vectors) in robotics and 3D vision.
  • Physics simulation: Learn interaction laws where outputs (like accelerations or forces) must rotate consistently with the scene and be insensitive to global translation.
  • Robotics and SLAM: Build models that don’t change their belief just because the robot turned its head or moved the camera; vector outputs (e.g., directions) should rotate with the robot.
  • Medical imaging and materials science: Analyze structures where orientation is arbitrary, but geometric relations are essential.

Use E(n)-equivariant networks when geometry and rigid motions are part of the problem definition and you want sample-efficient, symmetry-respecting predictions. Prefer simpler invariant heads (pooling) when you only need an orientation-independent scalar. Use full equivariance when predicting geometric quantities (vectors/tensors) or when downstream tasks require consistent directional behavior.

⚠️Common Mistakes

  • Confusing invariance with equivariance: Invariance means f(gx) = f(x); equivariance means f(gx) = g f(x). Predicting a vector field requires equivariance, not invariance.
  • Using absolute coordinates: Injecting raw x_i into scalar networks breaks translation equivariance; always use relative positions (x_i − x_j) and invariant scalars ‖x_i − x_j‖.
  • Mixing scalars and vectors incorrectly: Adding a learned bias to vector channels or combining them with scalars without geometry-aware rules destroys equivariance. Vector updates must be linear in relative vectors with scalar coefficients that depend only on invariants.
  • Ignoring reflections: Many tasks require O(n) (rotations + reflections), not just SO(n). Using cross products or pseudoscalar constructions can flip incorrectly under reflections.
  • Normalization layers and biases: Applying per-dimension biases or anisotropic normalization to vector channels breaks symmetry; prefer scalar-only normalization or channel-wise isotropic operations.
  • Floating-point tests: Numerical checks of equivariance will show tiny nonzero errors due to finite precision; use tolerances, not exact equality.
  • Overly dense graphs: Full O(N^2) connectivity grows quickly; use k-NN neighborhoods to keep computation tractable while preserving locality.

Key Formulas

Equivariance condition

f(ρ_X(g) x) = ρ_Y(g) f(x)

Explanation: A function f is equivariant if transforming the input by the group action ρ_X and then applying f equals applying f and then transforming the output by ρ_Y. This is the core requirement for E(n)-equivariant networks.

Structure of the Euclidean group

E(n) = ℝⁿ ⋊ O(n)

Explanation: The Euclidean group combines translations ℝⁿ and orthogonal transformations O(n) via a semidirect product. It formalizes that rigid motions are rotations/reflections followed by translations.

Group action on coordinates

g · x = R x + t,   R ∈ O(n), t ∈ ℝⁿ

Explanation: A Euclidean transformation maps coordinates by rotation/reflection and translation. This is how inputs are transformed in E(n)-equivariant models.

Distance invariance

‖R(x_i − x_j)‖² = ‖x_i − x_j‖²

Explanation: Squared distances between points are unchanged by orthogonal transformations. This justifies using distances as invariant inputs to scalar functions.

Feature update (EGNN-style)

h_i′ = ϕ_h(h_i, Σ_{j∈N(i)} w_ij h_j),   w_ij = ϕ_e(h_i, h_j, ‖x_i − x_j‖²)

Explanation: Node features are updated by aggregating neighbor features scaled by scalar weights that depend only on invariants. This preserves E(n) symmetry for scalar features.

Coordinate update (EGNN-style)

x_i′ = x_i + Σ_{j∈N(i)} (x_i − x_j) s_ij,   s_ij = ϕ_x(h_i, h_j, ‖x_i − x_j‖²)

Explanation: Coordinates are updated by linear combinations of relative vectors with scalar coefficients depending on invariants. This guarantees E(n)-equivariance of vector outputs.

Invariant pooling

y = Pool_i ψ(h_i)   (invariant head; Pool is e.g. sum or mean)

Explanation: Graph-level scalar predictions can be made invariant by pooling node-wise scalar features. Pooling commutes with E(n) actions because features are scalars.

Equivariant force field

F_i = Σ_{j≠i} k(‖x_i − x_j‖²)(x_i − x_j)

Explanation: A force constructed from relative vectors scaled by distance-based scalars is E(n)-equivariant. Under rigid motions, forces rotate/reflect exactly like coordinates.

EGNN layer complexity

O(Nk(F² + dF))

Explanation: For N nodes, k neighbors, feature size F, and dimension d, the main costs are computing scalar weights and feature MLPs per edge and aggregating vectors. This is the typical time complexity of one forward pass.

Numerical equivariance test

‖f(g · X) − ρ_Y(g) f(X)‖ ≤ ε

Explanation: In floating-point arithmetic, we test equivariance up to a small tolerance ε. This accounts for rounding errors and implementation noise.

Complexity Analysis

Consider an EGNN-style layer over a graph of N nodes in dimension d with k neighbors per node and scalar feature width F. For each edge (i, j), we compute invariants (e.g., squared distance), pass them with features through small MLPs to get scalar weights, and aggregate messages. The per-edge cost includes O(F²) multiplications for feature mixing (when ϕ_h is a two-layer MLP or a linear map) and O(d) work to accumulate coordinate updates via relative vectors. Summed over all O(Nk) edges, the time complexity is O(Nk(F² + dF)), dominated by feature transformations and vector aggregations; with full connectivity k ≈ N, this becomes O(N²(F² + dF)). Neighbor search (e.g., k-NN) can add O(N log N) to O(Nk) preprocessing depending on the method, but is often amortized or approximated.

Space complexity consists of storing features O(NF), coordinates O(Nd), and adjacency O(Nk). Temporary buffers for messages add O(NkF) if explicitly materialized; streaming aggregations can reduce peak memory to O(NF + Nd + Nk). Parameter memory is small compared to activations when using shallow MLPs per edge. For the force-field example that uses only distance-based scalars and relative vectors, time is O(Nkd) (or O(N²d) if dense) and space is O(Nd + Nk). In practice, choosing modest k (e.g., 16–64) balances accuracy and efficiency, and batching multiple graphs scales memory linearly with batch size.

Code Examples

EGNN-style layer with equivariance test (3D)
#include <bits/stdc++.h>
using namespace std;

// Simple utilities for vectors/matrices in 3D
struct Vec {
    double x, y, z;
};

Vec operator+(const Vec& a, const Vec& b){ return {a.x+b.x, a.y+b.y, a.z+b.z}; }
Vec operator-(const Vec& a, const Vec& b){ return {a.x-b.x, a.y-b.y, a.z-b.z}; }
Vec operator*(const Vec& a, double s){ return {a.x*s, a.y*s, a.z*s}; }
Vec& operator+=(Vec& a, const Vec& b){ a.x+=b.x; a.y+=b.y; a.z+=b.z; return a; }

double dot(const Vec& a, const Vec& b){ return a.x*b.x + a.y*b.y + a.z*b.z; }
double norm2(const Vec& a){ return dot(a,a); }

// 3x3 matrix for rotations/reflections
struct Mat3 {
    double m[3][3];
};

Vec mul(const Mat3& R, const Vec& v){
    return {
        R.m[0][0]*v.x + R.m[0][1]*v.y + R.m[0][2]*v.z,
        R.m[1][0]*v.x + R.m[1][1]*v.y + R.m[1][2]*v.z,
        R.m[2][0]*v.x + R.m[2][1]*v.y + R.m[2][2]*v.z
    };
}

Mat3 gram_schmidt_orthogonal(mt19937& rng, bool allow_reflection=true){
    normal_distribution<double> nd(0.0, 1.0);
    // Random matrix A
    double a[3][3];
    for(int i=0;i<3;i++) for(int j=0;j<3;j++) a[i][j] = nd(rng);
    // Gram-Schmidt orthonormalization of the columns of A
    vector<array<double,3>> v(3), u(3);
    for(int i=0;i<3;i++){ v[i] = {a[0][i], a[1][i], a[2][i]}; }
    auto proj = [&](array<double,3> u, array<double,3> v){
        double udotv = u[0]*v[0] + u[1]*v[1] + u[2]*v[2];
        double udotu = u[0]*u[0] + u[1]*u[1] + u[2]*u[2];
        double s = udotv / max(udotu, 1e-12);
        return array<double,3>{u[0]*s, u[1]*s, u[2]*s};
    };
    for(int i=0;i<3;i++){
        array<double,3> w = v[i];
        for(int j=0;j<i;j++){
            auto p = proj(u[j], v[i]);
            w[0] -= p[0]; w[1] -= p[1]; w[2] -= p[2];
        }
        double n = sqrt(max(1e-12, w[0]*w[0] + w[1]*w[1] + w[2]*w[2]));
        u[i] = {w[0]/n, w[1]/n, w[2]/n};
    }
    Mat3 R{};
    for(int i=0;i<3;i++) for(int j=0;j<3;j++) R.m[i][j] = u[j][i]; // columns are u
    // Flip one column if a pure rotation (det = +1) is required
    double det =
        R.m[0][0]*(R.m[1][1]*R.m[2][2]-R.m[1][2]*R.m[2][1]) -
        R.m[0][1]*(R.m[1][0]*R.m[2][2]-R.m[1][2]*R.m[2][0]) +
        R.m[0][2]*(R.m[1][0]*R.m[2][1]-R.m[1][1]*R.m[2][0]);
    if(!allow_reflection && det < 0){
        for(int i=0;i<3;i++) R.m[i][2] *= -1.0;
    }
    return R;
}

struct MLP {
    // Simple 2-layer MLP: in -> hidden -> out with ReLU
    vector<vector<double>> W1, W2;
    vector<double> b1, b2;
    MLP(){}
    MLP(int in_dim, int hid, int out_dim, mt19937& rng){
        normal_distribution<double> nd(0.0, 0.5);
        W1.assign(hid, vector<double>(in_dim));
        W2.assign(out_dim, vector<double>(hid));
        b1.assign(hid, 0.0);
        b2.assign(out_dim, 0.0);
        for(auto& r: W1) for(auto& v: r) v = nd(rng) / sqrt(in_dim);
        for(auto& r: W2) for(auto& v: r) v = nd(rng) / sqrt(hid);
    }
    vector<double> forward(const vector<double>& x) const {
        int hid = (int)W1.size(), in_dim = (int)W1[0].size();
        vector<double> h(hid, 0.0);
        for(int i=0;i<hid;i++){
            double s = b1[i];
            for(int j=0;j<in_dim;j++) s += W1[i][j]*x[j];
            h[i] = max(0.0, s); // ReLU
        }
        int out_dim = (int)W2.size();
        vector<double> y(out_dim, 0.0);
        for(int i=0;i<out_dim;i++){
            double s = b2[i];
            for(int j=0;j<hid;j++) s += W2[i][j]*h[j];
            y[i] = s; // linear output
        }
        return y;
    }
};

struct EGNNLayer {
    int F;       // feature size
    MLP phi_e;   // edge weight scalar
    MLP phi_h;   // feature update
    MLP phi_x;   // coordinate scalar
    double coord_scale;
    EGNNLayer(int F_, mt19937& rng, int hidden=32): F(F_), coord_scale(0.1){
        // Inputs: [h_i (F), h_j (F), d2 (1)] -> scalar
        phi_e = MLP(2*F + 1, hidden, 1, rng);
        phi_x = MLP(2*F + 1, hidden, 1, rng);
        // Inputs to phi_h: [h_i (F), m_i (F)] -> h_i' (F)
        phi_h = MLP(2*F, hidden, F, rng);
    }

    void forward(const vector<Vec>& X, const vector<vector<double>>& H,
                 vector<Vec>& Xout, vector<vector<double>>& Hout) const {
        int N = (int)X.size();
        Xout.assign(N, {0,0,0});
        Hout.assign(N, vector<double>(F, 0.0));
        // Precompute messages
        vector<vector<double>> m(N, vector<double>(F, 0.0));
        vector<Vec> dx(N, {0,0,0});
        for(int i=0;i<N;i++){
            for(int j=0;j<N;j++) if(i!=j){
                Vec rij = X[i] - X[j];
                double d2 = norm2(rij);
                // Edge scalar weights from invariant inputs only
                vector<double> in;
                in.reserve(2*F + 1);
                in.insert(in.end(), H[i].begin(), H[i].end());
                in.insert(in.end(), H[j].begin(), H[j].end());
                in.push_back(d2);
                double wij = phi_e.forward(in)[0];
                double sij = phi_x.forward(in)[0];
                // Feature message: scale neighbor feature by scalar weight
                for(int f=0; f<F; f++) m[i][f] += wij * H[j][f];
                // Coordinate update: linear in relative vector with scalar coeff
                dx[i] += rij * (coord_scale * sij);
            }
        }
        for(int i=0;i<N;i++){
            // Feature update
            vector<double> hin;
            hin.reserve(2*F);
            hin.insert(hin.end(), H[i].begin(), H[i].end());
            hin.insert(hin.end(), m[i].begin(), m[i].end());
            Hout[i] = phi_h.forward(hin);
            // Coordinate update is equivariant by design
            Xout[i] = X[i] + dx[i];
        }
    }
};

int main(){
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    mt19937 rng(42);
    uniform_real_distribution<double> ud(-1.0, 1.0);

    int N = 6; int F = 4;
    vector<Vec> X(N);
    vector<vector<double>> H(N, vector<double>(F));
    for(int i=0;i<N;i++){
        X[i] = {ud(rng), ud(rng), ud(rng)};
        for(int f=0; f<F; f++) H[i][f] = ud(rng);
    }

    EGNNLayer layer(F, rng, 32);

    vector<Vec> X1; vector<vector<double>> H1;
    layer.forward(X, H, X1, H1);

    // Apply a random orthogonal transform and translation
    Mat3 R = gram_schmidt_orthogonal(rng, /*allow_reflection=*/true);
    Vec t = {0.7, -0.3, 0.2};

    vector<Vec> Xg(N);
    for(int i=0;i<N;i++) Xg[i] = mul(R, X[i]) + t; // g·X

    vector<Vec> X2; vector<vector<double>> H2;
    layer.forward(Xg, H, X2, H2); // same H (scalar features don't transform)

    // Transform original output: g·(X1)
    vector<Vec> X1g(N);
    for(int i=0;i<N;i++) X1g[i] = mul(R, X1[i]) + t;

    // Compare X2 vs X1g and H2 vs H1 (features are scalars -> invariant)
    double pos_err = 0.0, feat_err = 0.0;
    for(int i=0;i<N;i++){
        pos_err = max(pos_err, fabs(X2[i].x - X1g[i].x));
        pos_err = max(pos_err, fabs(X2[i].y - X1g[i].y));
        pos_err = max(pos_err, fabs(X2[i].z - X1g[i].z));
        for(int f=0; f<F; f++) feat_err = max(feat_err, fabs(H2[i][f] - H1[i][f]));
    }

    cout.setf(std::ios::fixed); cout << setprecision(6);
    cout << "Max coordinate equivariance error: " << pos_err << "\n";
    cout << "Max feature invariance error: " << feat_err << "\n";
    cout << "(Small nonzero errors are expected due to floating-point and random weights.)\n";

    return 0;
}

This example implements a minimal EGNN-style layer. Scalar edge weights depend only on invariant inputs (features and squared distances). Coordinate updates are linear in relative vectors with scalar coefficients, guaranteeing E(n)-equivariance. We numerically test equivariance by applying a random orthogonal transform (rotation/reflection) and a translation to the inputs and verifying that the outputs transform accordingly. Features are treated as scalars (unchanged by E(n)); coordinates transform by R and t.

Time: O(N²(F² + dF)) for dense connectivity; with k-NN it becomes O(Nk(F² + dF)).
Space: O(NF + Nd + N²) in dense mode (adjacency implicit); can be reduced to O(NF + Nd + Nk) with sparse neighborhoods.
E(n)-equivariant force field from distance-based kernel
#include <bits/stdc++.h>
using namespace std;

struct Vec { double x, y, z; };
Vec operator+(const Vec& a, const Vec& b){ return {a.x+b.x, a.y+b.y, a.z+b.z}; }
Vec operator-(const Vec& a, const Vec& b){ return {a.x-b.x, a.y-b.y, a.z-b.z}; }
Vec operator*(const Vec& a, double s){ return {a.x*s, a.y*s, a.z*s}; }
Vec& operator+=(Vec& a, const Vec& b){ a.x+=b.x; a.y+=b.y; a.z+=b.z; return a; }

double dot(const Vec& a, const Vec& b){ return a.x*b.x + a.y*b.y + a.z*b.z; }
double norm2(const Vec& a){ return dot(a,a); }

struct Mat3 { double m[3][3]; };
Vec mul(const Mat3& R, const Vec& v){
    return { R.m[0][0]*v.x + R.m[0][1]*v.y + R.m[0][2]*v.z,
             R.m[1][0]*v.x + R.m[1][1]*v.y + R.m[1][2]*v.z,
             R.m[2][0]*v.x + R.m[2][1]*v.y + R.m[2][2]*v.z };
}

Mat3 random_orthogonal(mt19937& rng, bool allow_reflection=true){
    normal_distribution<double> nd(0.0, 1.0);
    // Gram-Schmidt on the columns of a random matrix
    double a[3][3]; for(int i=0;i<3;i++) for(int j=0;j<3;j++) a[i][j] = nd(rng);
    array<double,3> u[3];
    auto norm = [&](array<double,3> v){ return sqrt(max(1e-12, v[0]*v[0]+v[1]*v[1]+v[2]*v[2])); };
    auto proj = [&](array<double,3> u, array<double,3> v){
        double s = (u[0]*v[0]+u[1]*v[1]+u[2]*v[2]) / max(1e-12, u[0]*u[0]+u[1]*u[1]+u[2]*u[2]);
        return array<double,3>{u[0]*s, u[1]*s, u[2]*s}; };
    array<double,3> v[3] = { array<double,3>{a[0][0],a[1][0],a[2][0]},
                             array<double,3>{a[0][1],a[1][1],a[2][1]},
                             array<double,3>{a[0][2],a[1][2],a[2][2]} };
    for(int i=0;i<3;i++){
        auto w = v[i];
        for(int j=0;j<i;j++){ auto p = proj(u[j], v[i]); w[0]-=p[0]; w[1]-=p[1]; w[2]-=p[2]; }
        double n = norm(w); u[i] = {w[0]/n, w[1]/n, w[2]/n};
    }
    Mat3 R{}; for(int i=0;i<3;i++) for(int j=0;j<3;j++) R.m[i][j] = u[j][i];
    double det = R.m[0][0]*(R.m[1][1]*R.m[2][2]-R.m[1][2]*R.m[2][1])
               - R.m[0][1]*(R.m[1][0]*R.m[2][2]-R.m[1][2]*R.m[2][0])
               + R.m[0][2]*(R.m[1][0]*R.m[2][1]-R.m[1][1]*R.m[2][0]);
    if(!allow_reflection && det < 0){ for(int i=0;i<3;i++) R.m[i][2] *= -1.0; }
    return R;
}

struct RadialMLP {
    // k(d2) = w2 * ReLU(w1 · [d2, sqrt(d2), 1] + b1) + b2 (scalar output)
    array<double,3> w1; double b1; double w2; double b2;
    RadialMLP(mt19937& rng){
        normal_distribution<double> nd(0.0, 0.5);
        for(int i=0;i<3;i++) w1[i] = nd(rng);
        b1 = 0.0; w2 = nd(rng); b2 = 0.0;
    }
    double forward(double d2) const {
        double d = sqrt(max(1e-12, d2));
        double h = w1[0]*d2 + w1[1]*d + w1[2]*1.0 + b1;
        h = max(0.0, h); // ReLU hidden activation
        return w2 * h + b2;
    }
};

vector<Vec> compute_forces(const vector<Vec>& X, const RadialMLP& phi){
    int N = (int)X.size();
    vector<Vec> F(N, {0,0,0});
    for(int i=0;i<N;i++){
        Vec fi{0,0,0};
        for(int j=0;j<N;j++) if(i!=j){
            Vec rij = X[i] - X[j];
            double kij = phi.forward(norm2(rij));
            fi += rij * kij; // linear in relative vector, scalar depends on invariants
        }
        F[i] = fi;
    }
    return F;
}

int main(){
    mt19937 rng(7);
    uniform_real_distribution<double> ud(-1.0, 1.0);
    int N = 8; vector<Vec> X(N);
    for(int i=0;i<N;i++) X[i] = {ud(rng), ud(rng), ud(rng)};

    RadialMLP phi(rng);
    vector<Vec> F = compute_forces(X, phi);

    // Apply a random rigid motion g(x) = R x + t
    Mat3 R = random_orthogonal(rng, true);
    Vec t = {0.3, -0.4, 0.1};
    vector<Vec> Xg(N);
    for(int i=0;i<N;i++) Xg[i] = mul(R, X[i]) + t;
    vector<Vec> Fg = compute_forces(Xg, phi);

    // Transform original forces by R and compare to Fg
    double max_err = 0.0;
    for(int i=0;i<N;i++){
        Vec RF = mul(R, F[i]);
        max_err = max(max_err, fabs(RF.x - Fg[i].x));
        max_err = max(max_err, fabs(RF.y - Fg[i].y));
        max_err = max(max_err, fabs(RF.z - Fg[i].z));
    }
    cout.setf(std::ios::fixed); cout << setprecision(6);
    cout << "Max force equivariance error: " << max_err << "\n";
    cout << "(Small nonzero errors expected due to floating point.)\n";
    return 0;
}

This program defines an equivariant force field on a set of 3D points. Forces are sums of relative vectors scaled by a distance-based scalar kernel, which is invariant under E(n). Under any rigid motion (rotation/reflection + translation), the forces rotate or reflect exactly like the inputs, which we verify numerically.

Time: O(N²d) for dense all-pairs interactions; with k-NN neighbors, O(Nkd).
Space: O(Nd).