How I Study AI - Learn AI Papers & Lectures the Easy Way
Math · Intermediate

Vectors & Vector Spaces

Key Points

  • A vector is an element you can add and scale, and a vector space is any collection of such elements closed under these operations.
  • Coordinates are just recipes for building vectors from a chosen basis; change the basis and the recipe (the coordinates) changes too.
  • Linear combinations, span, basis, and dimension are the core ideas that organize any vector space.
  • Dot products, norms, and projections turn geometric intuition (lengths and angles) into algebra you can compute.
  • Gaussian elimination decides linear independence and span membership, and it is the engine behind solving Ax = b.
  • Orthonormal bases (e.g., from Gram–Schmidt) make projections and coordinate calculations numerically stable and simple.
  • In C++, vectors can be represented as std::vector<double> with operations like addition, scalar multiplication, and dot product.
  • Be careful with floating-point precision, dimension mismatches, and assuming R^n rules apply to all vector spaces.

Prerequisites

  • Fields and real numbers — Scalar multiplication happens in a field; the properties of R (or C) underlie vector space behavior.
  • Basic algebra and equations — Linear combinations and solving systems rely on algebraic manipulation.
  • Matrices and linear systems — Gaussian elimination, span checks, and coordinates are implemented via matrices.
  • Inner products and norms — Projections, orthogonality, and distances depend on these notions.
  • C++ vectors and loops — Implementing operations requires dynamic arrays, iteration, and basic error handling.
  • Floating-point arithmetic — Numerical stability, tolerances (eps), and conditioning affect practical implementations.

Detailed Explanation


01 Overview

Hook: Think of vectors as LEGO pieces that you can snap together and resize. A vector space is the entire box of compatible pieces that follow the same building rules.

Concept: A vector is something you can add to another vector and multiply by a scalar (a number from some field, like the real numbers). A vector space is a set of such vectors where addition and scalar multiplication behave predictably (associativity, commutativity, distributivity, etc.). Remarkably, these rules apply not only to arrows in the plane, but also to polynomials, functions, signals, and more.

Example: In R^3, the vectors (1, 2, 3) and (4, −1, 0.5) add to (5, 1, 3.5). Scaling (1, 2, 3) by 3 gives (3, 6, 9). The set of all linear combinations a(1, 0, 0) + b(0, 1, 0) + c(0, 0, 1) is the whole of R^3; the three standard basis vectors span the space.

The heart of vector spaces is how we describe and combine vectors: linear combinations form spans, minimal spanning sets are bases, and the number of basis vectors is the dimension. With an inner product (like the dot product), we also measure lengths and angles, enabling projections and orthogonality. Computationally, Gaussian elimination and Gram–Schmidt translate theory into concrete procedures for questions like “Is this vector in the span?” or “What are the coordinates in this basis?”

02 Intuition & Analogies

Imagine vectors as arrows on a map: you can walk one arrow, then another—this is vector addition. You can also scale an arrow, like walking twice as far in the same direction—this is scalar multiplication. A vector space is any world where these two actions make sense and obey consistent rules.

Another analogy: a smoothie bar. Each fruit (banana, strawberry, mango) is a basis ingredient. A smoothie (vector) is a recipe: how many bananas, strawberries, and mangoes you blend. Any smoothie is a linear combination of basis ingredients. If you add two smoothies, you add their recipes; if you double a recipe, you double every ingredient.

Changing basis is like switching your menu of ingredients. You still make the same smoothies, but the recipe card changes because you rewrite each smoothie using the new ingredients. An orthonormal basis is a set of ingredients that don’t overlap in flavor (orthogonal) and each has standard strength (unit length), making recipes easy to read and compare.

Projection is like asking: if your smoothie includes some off-menu flavors (noise), what is the closest smoothie you could make using only the menu ingredients (a subspace)? You keep only the parts aligned with your allowed directions and discard the rest.

Computation-wise, dot products capture alignment (how much two arrows point in the same direction), norms give sizes (lengths), and Gram–Schmidt turns any reasonable set of directions into a neat, non-overlapping menu.

03 Formal Definition

A vector space V over a field F (e.g., F = R or C) is a set V equipped with two operations: vector addition + : V × V → V and scalar multiplication ⋅ : F × V → V, satisfying the usual axioms: (V, +) is an abelian group (associativity, commutativity, identity 0, additive inverses), and scalar multiplication is compatible with field multiplication and distributes over vector addition and scalar addition. Formally, for all u, v, w ∈ V and a, b ∈ F: a(u + v) = au + av, (a + b)v = av + bv, a(bv) = (ab)v, and 1v = v. Given vectors v1, …, vk ∈ V and scalars α1, …, αk ∈ F, a linear combination is α1 v1 + ⋯ + αk vk. The span of a set S ⊆ V is the set of all finite linear combinations of elements of S. Vectors v1, …, vk are linearly independent if α1 v1 + ⋯ + αk vk = 0 implies all αi = 0. A basis is a linearly independent set that spans V; the number of vectors in any basis is the dimension dim(V). If V is equipped with an inner product ⟨⋅, ⋅⟩ (e.g., the dot product on R^n), then ∥v∥ = √⟨v, v⟩ is a norm, orthogonality means ⟨u, v⟩ = 0, and the projection of v onto a subspace with orthonormal basis {ui} is ∑i ⟨v, ui⟩ ui.
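As a small worked example of these definitions (added here for illustration), take B = {(1, 1), (1, −1)} in R^2: it is linearly independent and spans R^2, hence a basis, and coordinates in B are the unique coefficients of the linear combination:

```latex
% Independence: a(1,1) + b(1,-1) = (0,0) forces a+b = 0 and a-b = 0, hence a = b = 0.
% Coordinates of (3,1) with respect to B:
\alpha(1,1) + \beta(1,-1) = (3,1)
\;\Longrightarrow\; \alpha + \beta = 3,\quad \alpha - \beta = 1
\;\Longrightarrow\; \alpha = 2,\ \beta = 1
```

The same vector (3, 1) thus has coordinates (2, 1) relative to B: the vector is unchanged, only the recipe differs.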

04 When to Use

  • Modeling directions and magnitudes: physics (forces, velocities), computer graphics (positions, normals), and robotics (joint configurations) naturally live in vector spaces.
  • Data representation: feature vectors in machine learning, word embeddings in NLP, and pixel intensities in images are vectors in high-dimensional spaces.
  • Solving systems: linear systems Ax = b are statements about linear combinations of columns of A producing b; if b is in the span of the columns, a solution exists.
  • Signal processing and statistics: projections onto subspaces implement least-squares approximation; principal component analysis relies on bases that capture variance.
  • Geometry and optimization: norms measure size; dot products measure angle; projections and decompositions simplify constraints and compute nearest points.

Concrete triggers: you need to check linear independence, find or verify a basis, express a vector in new coordinates, compute distances or angles, or project onto a constraint space. In C++, you’ll implement operations like addition, scalar multiplication, and dot products, plus algorithms like Gaussian elimination or Gram–Schmidt, to carry out these tasks efficiently and robustly.
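One of the triggers above, comparing data vectors by angle (e.g., embeddings), reduces to a single dot product and two norms. A minimal sketch, assuming only the standard dot product on R^n (the function name cosine_similarity is ours, not from the examples below):

```cpp
#include <cmath>
#include <stdexcept>
#include <vector>

// Cosine similarity <x,y> / (||x|| ||y||): 1 means same direction,
// 0 means orthogonal, -1 means opposite.
double cosine_similarity(const std::vector<double>& x, const std::vector<double>& y) {
    if (x.size() != y.size()) throw std::invalid_argument("dimension mismatch");
    double dot = 0.0, nx = 0.0, ny = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        dot += x[i] * y[i];   // alignment term
        nx  += x[i] * x[i];   // squared norm of x
        ny  += y[i] * y[i];   // squared norm of y
    }
    double denom = std::sqrt(nx) * std::sqrt(ny);
    if (denom == 0.0) throw std::invalid_argument("zero vector has no direction");
    return dot / denom;
}
```

Note that scaling either argument leaves the result unchanged: only direction matters, not magnitude.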

⚠️Common Mistakes

  • Confusing coordinates with vectors: coordinates depend on the chosen basis; the vector itself does not. Always specify the basis when giving coordinates.
  • Assuming every vector space is R^n with the standard dot product: function spaces, polynomial spaces, and vector spaces over finite fields obey the same axioms but may lack the familiar geometric intuition. Clarify the field and the inner product (if any).
  • Treating any spanning set as a basis: spanning sets can be redundant. A basis must also be linearly independent.
  • Ignoring dimension and shape in code: adding vectors of different lengths or mismatching matrix dimensions leads to runtime errors. Validate dimensions before operations.
  • Floating-point pitfalls: exact independence in theory becomes near-dependence numerically. Use tolerances (epsilons), prefer orthonormalization for projections, and be wary of subtractive cancellation.
  • Mixing subspaces with affine spaces: a subspace must contain the zero vector and be closed under linear combinations; a shifted plane in R^3 is affine, not a subspace.
  • Overreliance on normal equations for projections without checking conditioning: A^T A can be ill-conditioned. Orthonormal bases or QR factorizations are more stable.
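The floating-point pitfall above can be made concrete in two dimensions: the normalized parallelogram area |u1·v2 − u2·v1| / (∥u∥∥v∥) is zero exactly when the vectors are dependent, so a numerical test compares it against a tolerance rather than against exact zero. A sketch (the function name and the 1e-9 tolerance are illustrative choices):

```cpp
#include <cmath>

// For u, v in R^2, |u1*v2 - u2*v1| is the area of the parallelogram they
// span: zero in exact arithmetic iff they are dependent. In floating point
// we divide by the norms and compare against a relative tolerance.
bool nearly_dependent_2d(double u1, double u2, double v1, double v2,
                         double eps = 1e-9) {
    double area  = std::fabs(u1 * v2 - u2 * v1);
    double scale = std::sqrt(u1 * u1 + u2 * u2) * std::sqrt(v1 * v1 + v2 * v2);
    if (scale == 0.0) return true;   // a zero vector is always dependent
    return area / scale < eps;       // relative test, not exact equality
}
```

For example, (1, 1) and (1, 1 + 1e-13) are independent in theory but should almost always be treated as dependent in double precision.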

Key Formulas

Span

\mathrm{span}(S) = \left\{ \sum_{i=1}^{k} \alpha_i v_i \;\middle|\; k \in \mathbb{N},\ v_i \in S,\ \alpha_i \in F \right\}

Explanation: The span of a set S is the set of all finite linear combinations of elements of S. It captures everything you can build from S using addition and scalar multiplication.

Linear Independence

v_1, \dots, v_k \text{ are independent} \iff \Bigl( \sum_{i=1}^{k} \alpha_i v_i = 0 \;\Rightarrow\; \alpha_1 = \cdots = \alpha_k = 0 \Bigr)

Explanation: A collection is independent if the only way to get the zero vector as a linear combination is to choose all coefficients zero. This prevents redundancy in a basis.

Dimension

\dim(V) = k \iff \exists \text{ a basis } \{b_1, \dots, b_k\} \subset V

Explanation: The dimension of a vector space V is the number of vectors in any basis for V. All bases of the same space have the same size.

Dot Product

\langle x, y \rangle = x^T y = \sum_{i=1}^{n} x_i y_i

Explanation: The inner product in R^n is the sum of pairwise products of coordinates. It measures alignment between vectors and enables angles and projections.

Euclidean Norm

\|x\|_2 = \sqrt{\langle x, x \rangle} = \sqrt{\sum_{i=1}^{n} x_i^2}

Explanation: The length of a vector is the square root of the sum of squared components. It quantifies magnitude and distance.

Projection onto a Line

\mathrm{proj}_u(v) = \frac{\langle v, u \rangle}{\langle u, u \rangle}\, u

Explanation: The projection of v onto the direction u is the component of v aligned with u. It rescales u by how much v points along u.
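A worked instance of this formula (illustrative numbers, added here): project v = (3, 4) onto u = (1, 2).

```latex
\langle v, u \rangle = 3 \cdot 1 + 4 \cdot 2 = 11, \qquad
\langle u, u \rangle = 1^2 + 2^2 = 5, \qquad
\mathrm{proj}_u(v) = \tfrac{11}{5}(1, 2) = (2.2,\ 4.4)
```

The residual v − proj_u(v) = (0.8, −0.4) is orthogonal to u, as expected: 0.8·1 + (−0.4)·2 = 0.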

Projection onto a Subspace

\mathrm{proj}_W(v) = \sum_{i=1}^{k} \langle v, u_i \rangle\, u_i \quad \text{for orthonormal } \{u_i\}

Explanation: If you have an orthonormal basis of a subspace W, project v by summing its components along each basis vector. Orthonormality makes coefficients simple inner products.

Pseudoinverse (Full Column Rank)

A^+ = (A^T A)^{-1} A^T \quad \text{(if columns are independent)}

Explanation: When A has independent columns, the pseudoinverse maps b to the least-squares solution x = A^+ b. It is related to the projection matrix onto the column space.
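A hedged sketch of the pseudoinverse in action: fitting a line y ≈ a + b·x by least squares. Here A has columns [1, x_i], so (A^T A)x = A^T b is a 2×2 system we can solve in closed form (the function fit_line is our own illustration; as the Common Mistakes section notes, normal equations can be ill-conditioned for larger problems):

```cpp
#include <cmath>
#include <stdexcept>
#include <utility>
#include <vector>

// Least-squares line y ≈ a + b*x via the normal equations (A^T A) x = A^T b,
// where A has columns [1, x_i]. The 2x2 system is solved in closed form.
std::pair<double, double> fit_line(const std::vector<double>& xs,
                                   const std::vector<double>& ys) {
    if (xs.size() != ys.size() || xs.empty())
        throw std::invalid_argument("need equally many x and y values");
    double n = (double)xs.size(), sx = 0, sy = 0, sxx = 0, sxy = 0;
    for (std::size_t i = 0; i < xs.size(); ++i) {
        sx += xs[i]; sy += ys[i];            // entries of A^T b and A^T A
        sxx += xs[i] * xs[i]; sxy += xs[i] * ys[i];
    }
    double det = n * sxx - sx * sx;          // det(A^T A)
    if (std::fabs(det) < 1e-12) throw std::runtime_error("ill-conditioned fit");
    double a = (sxx * sy - sx * sxy) / det;  // intercept
    double b = (n * sxy - sx * sy) / det;    // slope
    return {a, b};
}
```

For points that already lie on a line, the residual is zero and the fit recovers the line exactly; otherwise x = A^+ b gives the closest line in the least-squares sense.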

Projection Matrix

P = A (A^T A)^{-1} A^T

Explanation: This matrix projects any vector onto the column space of A when columns are independent. Applying P to b yields its projection onto span(A).

Rank–Nullity Theorem

\dim(\ker A) + \mathrm{rank}(A) = n

Explanation: For an m × n matrix A, the dimension of the null space plus the rank equals the number of columns. It partitions degrees of freedom into constrained and free parts.
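A small check of the theorem on an illustrative matrix:

```latex
A = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 4 & 6 \end{pmatrix}, \qquad
\mathrm{rank}(A) = 1 \ \text{(second row is twice the first)}, \quad n = 3
\;\Longrightarrow\; \dim(\ker A) = 3 - 1 = 2
```

Indeed, the null space is spanned by (−2, 1, 0) and (−3, 0, 1), each of which A maps to zero.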

Gaussian Elimination Cost

T(n) = \tfrac{2}{3} n^3 + O(n^2)

Explanation: Row-reducing a dense n × n matrix takes on the order of n^3 operations. This sets expectations for runtime when solving large systems.

Complexity Analysis

Basic vector operations with n components are linear-time. Adding two vectors or scaling by a scalar requires O(n) arithmetic and O(n) space if you create a new vector; in-place operations reduce extra space to O(1). The dot product and Euclidean norm also run in O(n), scanning components once.

Testing span membership or independence using Gaussian elimination depends on matrix shape. For a d × k matrix of k candidate basis vectors in d-dimensional space, elimination takes roughly O(d·k^2) when d ≥ k (cost dominated by eliminating below pivots across k columns), or more generally O(min(d, k)·d·k). Memory usage is O(d·k) to store the matrix plus O(d + k) for indices and temporary buffers. Solving a square n × n system is O(n^3) time and O(n^2) space for the matrix.

Orthonormalizing k vectors in R^d via classical Gram–Schmidt costs O(k^2·d) time: for each new vector you subtract projections onto all previous ones (each projection is O(d)). Storing the orthonormal basis costs O(k·d) space. Modified Gram–Schmidt has the same asymptotic cost but better numerical stability. Projection onto a subspace with an orthonormal basis costs O(k·d) per vector, since it requires k dot products and k scaled vector additions. If you instead form the normal equations A^T A, the one-time cost to construct and factor them is O(d·k^2 + k^3), and each subsequent projection is O(k·d), but conditioning may suffer.

Overall, for moderate k and d typical in practice, these algorithms are dominated by quadratic or cubic dependence on k when building bases and linear dependence on d when applying them.

Code Examples

A minimal Vector class: addition, scalar multiplication, dot product, norm, and projection onto a vector
#include <iostream>
#include <vector>
#include <cmath>
#include <stdexcept>
#include <iomanip>

struct Vec {
    std::vector<double> a; // components

    explicit Vec(size_t n = 0, double val = 0.0) : a(n, val) {}
    explicit Vec(const std::vector<double>& v) : a(v) {}

    size_t size() const { return a.size(); }

    // Element access
    double& operator[](size_t i) { return a[i]; }
    const double& operator[](size_t i) const { return a[i]; }

    // Addition
    Vec operator+(const Vec& other) const {
        if (size() != other.size()) throw std::invalid_argument("dimension mismatch");
        Vec res(size());
        for (size_t i = 0; i < size(); ++i) res[i] = a[i] + other[i];
        return res;
    }

    // Subtraction
    Vec operator-(const Vec& other) const {
        if (size() != other.size()) throw std::invalid_argument("dimension mismatch");
        Vec res(size());
        for (size_t i = 0; i < size(); ++i) res[i] = a[i] - other[i];
        return res;
    }

    // Scalar multiplication
    Vec operator*(double s) const {
        Vec res(size());
        for (size_t i = 0; i < size(); ++i) res[i] = a[i] * s;
        return res;
    }

    // Dot product
    double dot(const Vec& other) const {
        if (size() != other.size()) throw std::invalid_argument("dimension mismatch");
        double sum = 0.0;
        for (size_t i = 0; i < size(); ++i) sum += a[i] * other[i];
        return sum;
    }

    // Euclidean norm
    double norm() const {
        return std::sqrt(this->dot(*this));
    }

    // Projection of this vector onto direction u: proj_u(this)
    Vec proj_onto(const Vec& u, double eps = 1e-12) const {
        double denom = u.dot(u);
        if (denom < eps) throw std::invalid_argument("cannot project onto near-zero vector");
        double coeff = this->dot(u) / denom;
        return u * coeff;
    }
};

std::ostream& operator<<(std::ostream& os, const Vec& v) {
    os << std::fixed << std::setprecision(4) << "[";
    for (size_t i = 0; i < v.size(); ++i) {
        os << v[i];
        if (i + 1 != v.size()) os << ", ";
    }
    os << "]";
    return os;
}

int main() {
    Vec v({1.0, 2.0, 3.0});
    Vec w({4.0, -1.0, 0.5});

    Vec sum = v + w;            // addition
    Vec diff = v - w;           // subtraction
    Vec scaled = v * 3.0;       // scalar multiplication
    double dp = v.dot(w);       // dot product
    double nv = v.norm();       // norm
    Vec proj = v.proj_onto(w);  // projection of v onto w

    std::cout << "v = " << v << "\n";
    std::cout << "w = " << w << "\n";
    std::cout << "v + w = " << sum << "\n";
    std::cout << "v - w = " << diff << "\n";
    std::cout << "3 * v = " << scaled << "\n";
    std::cout << "v·w = " << dp << "\n";
    std::cout << "||v|| = " << nv << "\n";
    std::cout << "proj_w(v) = " << proj << "\n";
    return 0;
}

This program defines a lightweight vector type supporting addition, scalar multiplication, dot product, Euclidean norm, and projection onto a given direction. It demonstrates that vector addition and scaling follow component-wise rules and uses the dot product to compute lengths and projections.

Time: O(n) per vector operation (addition, scaling, dot, norm, projection) for dimension n.
Space: O(n) to store the vector; each operation returns a new O(n) vector (in-place variants could reduce extra space to O(1)).
Gaussian elimination to check if a vector lies in the span of given vectors (and to get coefficients)
#include <iostream>
#include <vector>
#include <utility>
#include <algorithm>
#include <cmath>
#include <stdexcept>
using namespace std;

// Solve A x = b using Gaussian elimination with partial pivoting.
// A: d x k matrix (stored as vector<vector<double>> with d rows, k cols)
// b: size d
// Returns pair<has_solution, coefficients x of size k> (one solution if many)
pair<bool, vector<double>> solve_linear_system(vector<vector<double>> A, vector<double> b, double eps = 1e-10) {
    int d = (int)A.size();
    if (d == 0) return {true, {}};
    int k = (int)A[0].size();
    for (int i = 0; i < d; ++i) if ((int)A[i].size() != k) throw invalid_argument("inconsistent row sizes");
    if ((int)b.size() != d) throw invalid_argument("dimension mismatch");

    // Build augmented matrix [A | b]
    vector<vector<double>> M(d, vector<double>(k + 1));
    for (int i = 0; i < d; ++i) {
        for (int j = 0; j < k; ++j) M[i][j] = A[i][j];
        M[i][k] = b[i];
    }

    int row = 0, col = 0;
    vector<int> pivot_col; pivot_col.reserve(min(d, k));

    // Forward elimination to row-echelon form
    while (row < d && col < k) {
        // Choose pivot: row with largest |M[r][col]|
        int piv = row;
        for (int r = row + 1; r < d; ++r) if (fabs(M[r][col]) > fabs(M[piv][col])) piv = r;
        if (fabs(M[piv][col]) < eps) { // no pivot in this column
            ++col;
            continue;
        }
        swap(M[piv], M[row]);
        // Normalize pivot row
        double pivval = M[row][col];
        for (int j = col; j <= k; ++j) M[row][j] /= pivval;
        // Eliminate below
        for (int r = row + 1; r < d; ++r) {
            double factor = M[r][col];
            if (fabs(factor) < eps) continue;
            for (int j = col; j <= k; ++j) M[r][j] -= factor * M[row][j];
        }
        pivot_col.push_back(col);
        ++row; ++col;
    }

    // Check for inconsistency: a row of zeros in A but nonzero in b
    for (int r = row; r < d; ++r) {
        double sumA = 0.0;
        for (int j = 0; j < k; ++j) sumA += fabs(M[r][j]);
        if (sumA < eps && fabs(M[r][k]) > eps) return {false, {}}; // 0 = nonzero
    }

    // Back substitution to get one solution (set free vars = 0)
    vector<double> x(k, 0.0);
    for (int i = (int)pivot_col.size() - 1; i >= 0; --i) {
        int pc = pivot_col[i]; // column of pivot in row i
        double rhs = M[i][k];
        for (int j = pc + 1; j < k; ++j) rhs -= M[i][j] * x[j];
        x[pc] = rhs; // pivot is 1; entries to its right use already-computed x values
    }

    return {true, x};
}

// Convenience: check if v is in span of 'basis' columns, and retrieve coefficients
pair<bool, vector<double>> is_in_span(const vector<vector<double>>& basis, const vector<double>& v, double eps = 1e-10) {
    // basis is a list of vectors in R^d; form A with these as columns
    if (basis.empty()) return {all_of(v.begin(), v.end(), [eps](double x){ return fabs(x) < eps; }), {}};
    int k = (int)basis.size();
    int d = (int)basis[0].size();
    for (const auto& b : basis) if ((int)b.size() != d) throw invalid_argument("basis vectors must have same dimension");
    if ((int)v.size() != d) throw invalid_argument("vector dimension mismatch");

    vector<vector<double>> A(d, vector<double>(k));
    for (int j = 0; j < k; ++j) for (int i = 0; i < d; ++i) A[i][j] = basis[j][i];

    return solve_linear_system(A, v, eps);
}

int main() {
    // Basis for the xy-plane in R^3
    vector<vector<double>> basis = { {1, 0, 0}, {0, 1, 0} };
    vector<double> v1 = {2, -3, 0}; // in span
    vector<double> v2 = {1, 1, 1};  // not in span

    auto r1 = is_in_span(basis, v1);
    auto r2 = is_in_span(basis, v2);

    cout << boolalpha;
    cout << "v1 in span? " << r1.first << "\n";
    if (r1.first) {
        cout << "coeffs for v1: [";
        for (size_t i = 0; i < r1.second.size(); ++i) cout << r1.second[i] << (i+1==r1.second.size()?"":", ");
        cout << "]\n";
    }

    cout << "v2 in span? " << r2.first << "\n";
    if (r2.first) {
        cout << "coeffs for v2: [";
        for (size_t i = 0; i < r2.second.size(); ++i) cout << r2.second[i] << (i+1==r2.second.size()?"":", ");
        cout << "]\n";
    } else {
        cout << "(no exact solution; the z-component prevents membership)\n";
    }
    return 0;
}

We form a matrix A whose columns are the candidate spanning vectors and solve Ax = v. If a solution x exists, v is in the span and x gives the coordinates of v in that basis. Gaussian elimination with partial pivoting detects inconsistencies robustly and provides one solution when many exist.

Time: O(d·k^2) for d-dimensional vectors and k basis vectors (assuming d ≥ k).
Space: O(d·k) to store A (plus O(d + k) temporaries).
Gram–Schmidt orthonormalization and projection onto a subspace
#include <iostream>
#include <vector>
#include <cmath>
#include <algorithm>
#include <stdexcept>
#include <iomanip>

using namespace std;

static double dot(const vector<double>& a, const vector<double>& b) {
    if (a.size() != b.size()) throw invalid_argument("dimension mismatch");
    double s = 0.0;
    for (size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

static double norm(const vector<double>& a) {
    return sqrt(dot(a, a));
}

static vector<double> add(const vector<double>& a, const vector<double>& b) {
    vector<double> r(a.size());
    for (size_t i = 0; i < a.size(); ++i) r[i] = a[i] + b[i];
    return r;
}

static vector<double> sub(const vector<double>& a, const vector<double>& b) {
    vector<double> r(a.size());
    for (size_t i = 0; i < a.size(); ++i) r[i] = a[i] - b[i];
    return r;
}

static vector<double> scal(double s, const vector<double>& a) {
    vector<double> r(a.size());
    for (size_t i = 0; i < a.size(); ++i) r[i] = s * a[i];
    return r;
}

// Modified Gram–Schmidt for numerical stability
vector<vector<double>> gram_schmidt_orthonormal(vector<vector<double>> vecs, double eps = 1e-12) {
    if (vecs.empty()) return {};
    size_t d = vecs[0].size();
    for (const auto& v : vecs) if (v.size() != d) throw invalid_argument("dimension mismatch");

    vector<vector<double>> U; U.reserve(vecs.size());
    for (auto v : vecs) {
        for (auto& u : U) {
            double c = dot(v, u);     // coefficient along u
            v = sub(v, scal(c, u));   // remove component along u
        }
        double nv = norm(v);
        if (nv > eps) {
            for (auto& x : v) x /= nv; // normalize to unit length
            U.push_back(v);
        }
        // else: v was (nearly) dependent; skip it
    }
    return U; // orthonormal set spanning the same subspace
}

vector<double> project_onto_subspace(const vector<double>& v, const vector<vector<double>>& U) {
    if (U.empty()) return vector<double>(v.size(), 0.0);
    vector<double> p(v.size(), 0.0);
    for (const auto& u : U) {
        double c = dot(v, u);  // coefficient along orthonormal direction
        p = add(p, scal(c, u));
    }
    return p;
}

int main() {
    // Subspace spanned by two vectors in R^3
    vector<vector<double>> basis = { {1, 1, 0}, {1, 0, 1} };
    vector<double> v = {2, 1, 3};

    auto U = gram_schmidt_orthonormal(basis);

    cout << fixed << setprecision(6);
    cout << "Orthonormal basis U (rows):\n";
    for (auto& u : U) {
        cout << "[";
        for (size_t i = 0; i < u.size(); ++i) cout << u[i] << (i+1==u.size()?"":", ");
        cout << "]\n";
    }

    vector<double> p = project_onto_subspace(v, U);
    cout << "v = [" << v[0] << ", " << v[1] << ", " << v[2] << "]\n";
    cout << "Projection p = [" << p[0] << ", " << p[1] << ", " << p[2] << "]\n";
    cout << "Residual r = v - p = [" << v[0]-p[0] << ", " << v[1]-p[1] << ", " << v[2]-p[2] << "]\n";
    cout << "||r|| = " << norm(vector<double>{v[0]-p[0], v[1]-p[1], v[2]-p[2]}) << "\n";

    return 0;
}

We use modified Gram–Schmidt to convert a spanning set into an orthonormal basis. Projection onto the subspace then becomes a sum of inner products times basis vectors, which is simple and numerically stable. The residual v − p is orthogonal to the subspace and has minimal norm among all approximations from the subspace.

Time: O(k^2·d) to orthonormalize k vectors in dimension d; projecting a single vector is O(k·d).
Space: O(k·d) to store the orthonormal basis; O(d) extra for temporaries.
#vector space#basis#span#linear independence#dimension#dot product#norm#projection#gram-schmidt#gaussian elimination#orthonormal#rank#null space#linear combination#change of basis