🎓How I Study AIHISA
📖Read
📄Papers📰Blogs🎬Courses
💡Learn
🛤️Paths📚Topics💡Concepts🎴Shorts
🎯Practice
⏱️Coach🧩Problems🧠Thinking🎯Prompts🧠Review
SearchSettings
How I Study AI - Learn AI Papers & Lectures the Easy Way
📚TheoryIntermediate

Group Convolution

Key Points

  • •
    Group convolution combines two functions defined on a group by summing over products aligned by the group operation, generalizing the usual circular convolution on integers modulo n.
  • •
    For a finite group G, the definition is (f *_G k)(g) = ∑h∈G​ f(h)\,k(g−1h), which uses the inverse of g to 'shift' k before multiplying by f and summing.
  • •
    On cyclic groups Cn​, group convolution is exactly circular convolution and can be computed efficiently with the FFT in O(n log n) time.
  • •
    On general (non-abelian) groups, convolution is still associative and has an identity (the delta at the identity element), but it need not be commutative.
  • •
    The group Fourier transform turns convolution into matrix (or scalar, in the abelian case) multiplication: f∗G​k​(ρ) = \hat f(ρ) \hat k(ρ).
  • •
    A naive implementation over a finite group of size |G| costs O(|G|^2) time and O(|G|) extra space; precomputing multiplication and inverses helps avoid mistakes and speeds lookups.
  • •
    Group convolution models random walks on groups: convolving a step distribution with itself t times gives the t-step distribution.
  • •
    In machine learning and signal processing, group convolution is the backbone of equivariant architectures (e.g., rotations, permutations), enabling weight sharing across symmetries.

Prerequisites

  • →Basic group theory — Understanding identities, inverses, and group multiplication is essential to interpret the convolution formula.
  • →Discrete convolution on integers — Provides the familiar sliding-and-summing intuition that generalizes to groups.
  • →Fourier transform and FFT — Enables fast convolution on abelian groups and explains the convolution theorem.
  • →Permutations and composition — Needed to implement non-abelian groups like S3 using permutation composition.
  • →Probability basics — Interpreting convolution powers as t-step distributions in random walks requires normalization and expectation concepts.
  • →C++ vectors, maps, and complex numbers — Required to implement efficient data structures, FFTs, and numerical operations in code.

Detailed Explanation

Tap terms for definitions

01Overview

Group convolution is a way to blend two functions that live on a group so that the result respects the group’s symmetry structure. If you already know ordinary convolution on the integers (sliding, flipping, and summing), group convolution is the same idea but performed with the group’s multiplication instead of integer addition. Given a finite group G and two functions f, k: G → R (or C), their convolution produces a new function f *_G k: G → R defined by summing products of values of f and a shifted version of k. The shift is expressed using the group operation and an inverse, ensuring the operation behaves well under left translations. This operation is central in many fields: in signal processing on cyclic groups (circular convolution), in probability on groups (random walks and mixing), in representation theory (diagonalization by the group Fourier transform), and in machine learning (equivariant neural networks over symmetries like rotations or permutations). For abelian groups such as C_n, powerful FFT algorithms make convolution extremely fast, reducing complexity from quadratic to near-linear. For non-abelian groups, convolution remains associative and has an identity element (the delta at the group identity), but is generally non-commutative; its spectral analysis uses matrix-valued irreducible representations.

02Intuition & Analogies

Think of a group as a set of moves you can make—like rotating a shape, permuting cards, or adding hours on a clock. A function on a group assigns a number to each move: for example, how much we “like” that move, or the probability we take that move in a random walk. Convolution asks: if I first pick a move according to f, then follow it by a move according to k (but aligned relative to a target move g), what is the overall compatibility? To measure this at g, we consider all possible first moves h, and then choose the second move so that together they land at g. On a clock (the cyclic group C_n), this becomes the familiar circular convolution: to score position g, sum over all ways to reach g by first going to h then moving from h to g. On permutations (like shuffling three items), convolution tallies all two-step shuffles whose product equals a target shuffle g. This is why convolution naturally models multi-step processes: applying distributions step-by-step is equivalent to convolving them. The inverse in the formula is the bookkeeping that makes “shift by g” line up properly with left-multiplication. It also ensures beautiful properties: convolving with the delta at the identity does nothing; associativity matches doing three steps in any grouping; and with Fourier analysis on the group, this complicated sum turns into simple multiplication (scalar for abelian groups, matrix for non-abelian groups), the same magic that powers the FFT for ordinary signals.

03Formal Definition

Let G be a finite group with identity e. For functions f, k: G → C, their (left) group convolution is defined by (f *_G k)(g) = ∑h∈G​ f(h)\,k(g−1 h). Here the argument g−1 h means we left-translate k by g so its value at h is compared with f(h). This choice makes convolution compatible with left actions: for any x ∈ G, Lx​(f) defined by Lx​(f)(g) = f(x−1 g) satisfies Lx​(f *_G k) = (Lx​ f) *_G k. The delta function at the identity, δe​(g) = 1 if g=e and 0 otherwise, is the identity for convolution: δe​ *_G f=f *_G δe​ = f. Convolution is associative: (f *_G k) *_G m=f *_G (k *_G m). It is commutative if and only if G is abelian. The inner product ⟨ f, k ⟩ = ∑g∈G​ f(g)​ k(g) induces the adjoint relation ⟨ f *_G k, m ⟩ = ⟨ f, m *_G k~ ⟩ with k~(g) = k(g−1)​. For abelian G, the group characters χ diagonalize convolution: f∗G​k​(χ) = \hat f(χ)\,\hat k(χ). For general G, irreducible representations ρ yield matrix-valued Fourier transforms \hat f(ρ) = ∑g∈G​ f(g) ρ(g), and convolution becomes matrix multiplication: f∗G​k​(ρ) = \hat f(ρ)\,\hat k(ρ).

04When to Use

  • Signals on periodic domains: For data defined on a cycle (e.g., time on a clock, angles discretized into n bins), use group convolution on C_n to implement circular filtering without boundary artifacts. FFT-based methods give O(n \log n) performance.
  • Random walks and Markov chains on groups: If a step distribution p on G defines one-step movement, then the t-step distribution is p^{(*t)} (convolved with itself t times). This is key for mixing time analysis and sampling.
  • Symmetry-aware machine learning: Group convolutions encode equivariance—outputs transform predictably when inputs are transformed—crucial for rotation/permutation-equivariant CNNs, spherical CNNs, and graph models built on Cayley graphs.
  • Counting and combinatorics with symmetry: Convolution tallies compositions of group-labeled structures and appears in Burnside/Polya contexts via class functions and characters.
  • Spectral methods: When you can exploit the group Fourier transform (characters for abelian groups, irreps for general groups), convolution becomes pointwise (or matrix) multiplication in the frequency domain, enabling fast algorithms, denoising, and deconvolution.
  • Physical systems with symmetry: Lattices with periodic boundary, molecular symmetries, or robotics orientation spaces often discretize to finite groups where group convolution is the natural filtering operation.

⚠️Common Mistakes

  • Mixing left and right conventions: The definition here uses k(g^{-1} h). Using k(h g^{-1}) is a right-convolution variant. Be consistent throughout your code and proofs.
  • Forgetting the inverse: Writing (f *_G k)(g) = \sum_h f(h) k(gh) breaks equivariance to left translations. The inverse is essential.
  • Confusing linear vs circular convolution: On C_n, group convolution is circular. Padding and using a length-2n FFT computes linear convolution and will wrap incorrectly if you expect group convolution.
  • Mismatched element orderings: Arrays representing f and k must share the same element ordering and group law. Precompute and reuse a single multiplication table and inverse map.
  • Assuming commutativity: For non-abelian groups, f *_G k generally ≠ k *_G f. Do not reorder factors in code or algebra unless the group is abelian.
  • Numerical issues in FFT: Not using complex numbers, forgetting 1/n scaling on the inverse FFT, or using non power-of-two sizes with a radix-2 FFT cause errors.
  • Not normalizing probabilities: In random-walk applications, ensure \sum_g f(g) = 1 and preserve normalization after convolution.
  • Inefficient lookups: Recomputing products or inverses inside double loops is costly. Precompute mul[g][h] and inv[g] to achieve tight O(|G|^2) performance.

Key Formulas

Group Convolution (Left)

(f∗G​k)(g)=h∈G∑​f(h)k(g−1h)

Explanation: Defines convolution of two functions on a finite group. The inverse aligns k relative to g so the operation is left-equivariant.

Delta at Identity

δe​(g)={1,0,​g=eotherwise​

Explanation: This function acts as the multiplicative identity for convolution: convolving with it leaves any function unchanged.

Associativity

(f∗G​k)∗G​m=f∗G​(k∗G​m)

Explanation: Group convolution can be grouped in any order, matching the idea of composing three steps regardless of parenthesization.

Commutativity Criterion

f∗G​k=k∗G​f⟺G is abelian

Explanation: Convolution commutes exactly when the underlying group’s multiplication commutes.

Convolution Theorem (Abelian)

f∗G​k​(χ)=f^​(χ)k^(χ)

Explanation: For abelian groups, the character transform converts convolution into pointwise multiplication, enabling FFT-based acceleration.

Convolution Theorem (Non-Abelian)

f∗G​k​(ρ)=f^​(ρ)k^(ρ),f^​(ρ)=g∈G∑​f(g)ρ(g)

Explanation: In the non-abelian case, the Fourier transform is matrix-valued; convolution becomes matrix multiplication in each irrep block.

Plancherel (Finite Groups)

g∈G∑​∣f(g)∣2=∣G∣1​ρ∈G∑​dρ​∥f^​(ρ)∥F2​

Explanation: Energy is preserved between spatial and spectral domains up to representation dimensions d_ρ. This underpins Parseval-like identities on groups.

Convolution Power

(f(∗t))(g)=t times(f∗G​f∗G​⋯∗G​f)​​(g)

Explanation: Repeated application of the same step distribution models t-step processes such as random walks on groups.

FFT Complexity (Cyclic)

T(n)=O(nlogn)

Explanation: Computing circular convolution on Cn​ via the FFT takes near-linear time in n, much faster than the naive O(n2) sum.

Complexity Analysis

Let ∣G∣ = N. A direct implementation of group convolution computes (f *_G k)(g) for each g by summing over all h in G, leading to N outputs each requiring N multiplications/additions. This yields time complexity O(N2) and space complexity O(N) for the output (plus O(N) to store inverses and O(N2) if you cache a full multiplication table). If group products are expensive (e.g., composing permutations), precomputing a multiplication table mul[g][h] and inverses inv[g] reduces inner-loop costs to O(1) lookups at the expense of O(N2) memory and O(N2) preprocessing time. For abelian groups with a known FFT (e.g., cyclic Cn​), convolution can be accelerated using the convolution theorem: transform both signals, multiply pointwise, and inverse transform. This reduces time to O(N log N) and space to O(N) (to hold transforms). The constants are small and highly optimized in practice. For groups that factor as products of cyclic groups, multidimensional FFTs still achieve O(N log N). For non-abelian groups, the group Fourier transform produces blocks of sizes d_ρ, one per irrep ρ, and convolution becomes matrix multiplications: f∗G​k​(ρ) = \hat f(ρ) \hat k(ρ). The arithmetic cost is roughly O\big(∑_ρ d_\rho^3\big) to multiply blocks, plus the cost to compute transforms. Specialized “group FFTs” (e.g., for symmetric or dihedral groups) can achieve subquadratic to near-linear runtimes, but general-purpose implementations are complex and have larger constants. In many small or moderate-size non-abelian groups, the naive O(N2) method with table lookups is competitive and simpler to implement accurately.

Code Examples

Circular (group) convolution on the cyclic group C_n using FFT
1#include <bits/stdc++.h>
2using namespace std;
3
4// Iterative radix-2 FFT. Requires n to be a power of two.
5void fft(vector<complex<double>>& a, bool invert) {
6 int n = (int)a.size();
7 static vector<int> rev;
8 static vector<complex<double>> roots{0, 1};
9
10 if ((int)rev.size() != n) {
11 int k = __builtin_ctz(n);
12 rev.assign(n, 0);
13 for (int i = 0; i < n; i++) {
14 rev[i] = (rev[i >> 1] >> 1) | ((i & 1) << (k - 1));
15 }
16 }
17 if ((int)roots.size() < n) {
18 int k = __builtin_ctz(roots.size());
19 roots.resize(n);
20 while ((1 << k) < n) {
21 double angle = 2 * M_PI / (1 << (k + 1));
22 for (int i = 1 << (k - 1); i < (1 << k); i++) {
23 roots[2 * i] = roots[i];
24 double ang = angle * (2 * i + 1 - (1 << k));
25 roots[2 * i + 1] = complex<double>(cos(ang), sin(ang));
26 }
27 k++;
28 }
29 }
30
31 for (int i = 0; i < n; i++) if (i < rev[i]) swap(a[i], a[rev[i]]);
32
33 for (int len = 1; len < n; len <<= 1) {
34 for (int i = 0; i < n; i += 2 * len) {
35 for (int j = 0; j < len; j++) {
36 complex<double> u = a[i + j];
37 complex<double> v = a[i + j + len] * roots[len + j];
38 a[i + j] = u + v;
39 a[i + j + len] = u - v;
40 }
41 }
42 }
43 if (invert) {
44 reverse(a.begin() + 1, a.end());
45 for (auto &x : a) x /= n;
46 }
47}
48
49bool is_power_of_two(size_t n) {
50 return n && ((n & (n - 1)) == 0);
51}
52
53// Circular convolution on C_n: (f * k)[i] = sum_j f[j] * k[(i - j) mod n]
54vector<double> circular_convolution(const vector<double>& f, const vector<double>& k) {
55 size_t n = f.size();
56 if (k.size() != n) throw runtime_error("Inputs must have same size");
57 if (!is_power_of_two(n)) throw runtime_error("n must be a power of two for this FFT implementation");
58
59 vector<complex<double>> Fa(n), Ka(n);
60 for (size_t i = 0; i < n; i++) {
61 Fa[i] = complex<double>(f[i], 0.0);
62 Ka[i] = complex<double>(k[i], 0.0);
63 }
64 fft(Fa, false);
65 fft(Ka, false);
66 for (size_t i = 0; i < n; i++) Fa[i] *= Ka[i];
67 fft(Fa, true); // inverse FFT (with 1/n scaling inside)
68
69 vector<double> out(n);
70 for (size_t i = 0; i < n; i++) out[i] = Fa[i].real();
71 return out;
72}
73
74int main() {
75 // Example on C_8 (size 8, power of two): simple smoothing kernel
76 vector<double> f = {1,2,3,4,3,2,1,0};
77 vector<double> k = {0.25, 0.5, 0.25, 0, 0, 0, 0, 0}; // local averaging (mod 8)
78
79 vector<double> conv = circular_convolution(f, k);
80
81 cout.setf(ios::fixed); cout << setprecision(6);
82 cout << "Circular convolution on C_8:\n";
83 for (double x : conv) cout << x << ' ';
84 cout << "\n";
85 return 0;
86}
87

Implements circular convolution on the cyclic group C_n using an iterative radix-2 FFT. The group law is addition mod n, so (f * k)[i] = sum_j f[j] k[(i - j) mod n]. The FFT converts convolution into pointwise multiplication of spectra, then the inverse FFT returns the result. This matches the group-convolution definition with g^{-1}h becoming (i - j) mod n in the abelian additive notation.

Time: O(n \log n)Space: O(n)
Generic group convolution on S3 using permutations (no hard-coded Cayley table)
1#include <bits/stdc++.h>
2using namespace std;
3
4// Represent a permutation on {0,1,2} as a vector p where p[i] is the image of i.
5using Perm = array<int,3>;
6
7Perm compose(const Perm& a, const Perm& b) {
8 // Composition a ∘ b: first apply b, then a. (a∘b)(i) = a[b[i]]
9 return Perm{ a[b[0]], a[b[1]], a[b[2]] };
10}
11
12Perm inverse(const Perm& p) {
13 Perm inv{};
14 for (int i = 0; i < 3; i++) inv[p[i]] = i;
15 return inv;
16}
17
18struct FiniteGroup {
19 vector<Perm> elems; // list of elements
20 unordered_map<string,int> id_of; // map serialized perm -> index
21 vector<vector<int>> mul; // mul[i][j] = index of elems[i] * elems[j]
22 vector<int> inv; // inv[i] = index of inverse of elems[i]
23
24 static string key(const Perm& p){ return string{char('0'+p[0]), char('0'+p[1]), char('0'+p[2])}; }
25
26 static FiniteGroup S3() {
27 // Generators: s = (0 1), t = (1 2)
28 Perm e{0,1,2};
29 Perm s{1,0,2};
30 Perm t{0,2,1};
31
32 FiniteGroup G;
33 queue<Perm> q;
34 auto add = [&](const Perm& p) {
35 string k = key(p);
36 if (!G.id_of.count(k)) {
37 int id = (int)G.elems.size();
38 G.id_of[k] = id; G.elems.push_back(p); q.push(p);
39 }
40 };
41 add(e); add(s); add(t);
42 // Close under composition by generators until no growth
43 while (!q.empty()) {
44 Perm a = q.front(); q.pop();
45 for (const Perm& g : {s, t}) {
46 Perm ag = compose(a, g);
47 Perm ga = compose(g, a);
48 add(ag); add(ga);
49 }
50 }
51 int n = (int)G.elems.size();
52 G.mul.assign(n, vector<int>(n));
53 G.inv.assign(n, -1);
54 for (int i = 0; i < n; i++) {
55 // inverse index
56 Perm pinv = inverse(G.elems[i]);
57 G.inv[i] = G.id_of[key(pinv)];
58 for (int j = 0; j < n; j++) {
59 Perm c = compose(G.elems[i], G.elems[j]);
60 G.mul[i][j] = G.id_of[key(c)];
61 }
62 }
63 return G;
64 }
65};
66
67// Group convolution: (f *_G k)(g) = sum_h f(h) * k(g^{-1} h)
68vector<double> group_convolution(const FiniteGroup& G, const vector<double>& f, const vector<double>& k) {
69 int n = (int)G.elems.size();
70 if ((int)f.size() != n || (int)k.size() != n) throw runtime_error("Size mismatch");
71 vector<double> out(n, 0.0);
72 for (int g = 0; g < n; g++) {
73 int ginv = G.inv[g];
74 double acc = 0.0;
75 for (int h = 0; h < n; h++) {
76 int idx = G.mul[ginv][h]; // index of g^{-1} * h
77 acc += f[h] * k[idx];
78 }
79 out[g] = acc;
80 }
81 return out;
82}
83
84int main(){
85 FiniteGroup G = FiniteGroup::S3();
86 int n = (int)G.elems.size();
87 cout << "|S3| = " << n << "\n"; // should be 6
88
89 // Define two functions on S3 by index; for demo pick simple values
90 vector<double> f(n, 0.0), k(n, 0.0);
91 f[0] = 1.0; // delta at identity
92 // Let k assign weights: k[0]=1, others small
93 for (int i = 0; i < n; i++) k[i] = (i==0 ? 1.0 : 0.1*i);
94
95 // delta_e * k == k
96 vector<double> conv1 = group_convolution(G, f, k);
97 cout << "delta_e * k: "; for (double x: conv1) cout << fixed << setprecision(3) << x << ' '; cout << "\n";
98
99 // General convolution f2 * k
100 vector<double> f2(n); iota(f2.begin(), f2.end(), 1.0); // f2[i] = 1,2,3,4,5,6
101 vector<double> conv2 = group_convolution(G, f2, k);
102 cout << "f2 * k: "; for (double x: conv2) cout << fixed << setprecision(3) << x << ' '; cout << "\n";
103
104 return 0;
105}
106

Builds S3 from permutation generators without hard-coding a Cayley table, then computes (f *_G k)(g) = sum_h f(h) k(g^{-1} h) using precomputed multiplication and inverses. The example verifies that convolving with the delta at the identity leaves k unchanged and shows a generic convolution output.

Time: O(|G|^2)Space: O(|G|^2) for the multiplication table, O(|G|) for functions
Convolution powers for random walks on S3 via exponentiation by squaring
1#include <bits/stdc++.h>
2using namespace std;
3
4using Perm = array<int,3>;
5Perm compose(const Perm& a, const Perm& b){ return Perm{ a[b[0]], a[b[1]], a[b[2]] }; }
6Perm inverse(const Perm& p){ Perm inv{}; for(int i=0;i<3;i++) inv[p[i]] = i; return inv; }
7
8struct FiniteGroup {
9 vector<Perm> elems; unordered_map<string,int> id_of; vector<vector<int>> mul; vector<int> inv;
10 static string key(const Perm& p){ return string{char('0'+p[0]), char('0'+p[1]), char('0'+p[2])}; }
11 static FiniteGroup S3(){
12 Perm e{0,1,2}, s{1,0,2}, t{0,2,1};
13 FiniteGroup G; queue<Perm> q;
14 auto add = [&](const Perm& p){ string k=key(p); if(!G.id_of.count(k)){ int id=G.elems.size(); G.id_of[k]=id; G.elems.push_back(p); q.push(p);} };
15 add(e); add(s); add(t);
16 while(!q.empty()){ Perm a=q.front(); q.pop(); for(const Perm& g: {s,t}){ auto ag=compose(a,g); auto ga=compose(g,a); add(ag); add(ga);} }
17 int n=G.elems.size(); G.mul.assign(n, vector<int>(n)); G.inv.assign(n,-1);
18 for(int i=0;i<n;i++){ Perm pinv=inverse(G.elems[i]); G.inv[i]=G.id_of[key(pinv)]; for(int j=0;j<n;j++){ Perm c=compose(G.elems[i], G.elems[j]); G.mul[i][j]=G.id_of[key(c)]; }}
19 return G;
20 }
21};
22
23vector<double> convolve(const FiniteGroup& G, const vector<double>& f, const vector<double>& k){
24 int n = G.elems.size(); vector<double> out(n,0.0);
25 for(int g=0; g<n; g++){
26 int ginv = G.inv[g]; double acc=0.0; for(int h=0; h<n; h++){ int idx = G.mul[ginv][h]; acc += f[h]*k[idx]; } out[g]=acc; }
27 return out;
28}
29
30vector<double> delta_identity(const FiniteGroup& G){ vector<double> d(G.elems.size(),0.0); d[0]=1.0; return d; }
31
32// Fast exponentiation under convolution: returns p^{(*t)}
33vector<double> conv_power(const FiniteGroup& G, vector<double> p, long long t){
34 vector<double> res = delta_identity(G); // identity for convolution
35 while(t>0){
36 if (t & 1LL) res = convolve(G, res, p);
37 t >>= 1LL;
38 if (t>0) p = convolve(G, p, p);
39 }
40 return res;
41}
42
43int main(){
44 FiniteGroup G = FiniteGroup::S3();
45 int n = G.elems.size();
46
47 // Step distribution: with probability 1/2 apply s=(0 1), with 1/2 apply t=(1 2)
48 vector<double> step(n, 0.0);
49 // Identify indices of s and t
50 Perm s{1,0,2}, t{0,2,1};
51 auto idx = [&](const Perm& p){ return G.id_of[FiniteGroup::key(p)]; };
52 step[idx(s)] = 0.5; step[idx(t)] = 0.5;
53
54 long long T = 5; // number of steps
55 vector<double> dist = conv_power(G, step, T);
56
57 cout.setf(ios::fixed); cout<<setprecision(6);
58 cout << "Distribution after "<<T<<" steps on S3 (sum should be 1):\n";
59 double sum=0.0; for(double x: dist){ cout<<x<<' '; sum+=x; } cout<<"\nTotal= "<<sum<<"\n";
60 return 0;
61}
62

Models a random walk on S3 where each step applies one of two transpositions with equal probability. Convolution powers p^{(*t)} give the t-step distribution. Exponentiation by squaring reduces the number of convolutions from O(t) to O(log t).

Time: O(|G|^2 \log t)Space: O(|G|^2) for the multiplication table, O(|G|) for distributions
#group convolution#finite group#circular convolution#cyclic group#s3 permutations#fft on groups#representation theory#equivariance#cayley table#random walk#convolution power#character transform#non-abelian convolution