
Matrix Factorizations (Numerical)

Key Points

  • Matrix factorizations rewrite a matrix into simpler building blocks (triangular or orthogonal) that make solving and analyzing linear systems much easier.
  • LU with partial pivoting is the standard tool for solving square systems efficiently and stably by reducing A to lower and upper triangular matrices.
  • QR (preferably via Householder reflections) is the robust method for least-squares problems and for obtaining orthonormal bases.
  • Cholesky is the fastest and most memory-friendly factorization for symmetric positive definite (SPD) matrices.
  • Numerical stability matters: use partial pivoting for LU, Householder (or modified Gram–Schmidt) for QR, and check SPD for Cholesky.
  • Operation counts are cubic in the matrix dimension for dense problems: roughly O(n^3) time and O(n^2) space.
  • Determinants, ranks, and condition indicators can be extracted cheaply from factorizations (e.g., det(A) via LU diagonals).
  • In practice, prefer library routines (BLAS/LAPACK/Eigen); the provided C++ code demonstrates the algorithms and their pitfalls on small inputs.

Prerequisites

  • →Gaussian elimination — Foundation for LU factorization and understanding pivoting and triangular solves.
  • →Vector and matrix norms — Needed to discuss orthogonality, conditioning, and least-squares residuals.
  • →Dot products and projections — Core to Gram–Schmidt and understanding Householder reflections.
  • →Properties of SPD matrices — Required to know when Cholesky applies and why it is stable and unique.
  • →Floating-point arithmetic and rounding error — Essential for understanding numerical stability and algorithm choices.
  • →Big-O notation and flop counts — To analyze algorithmic time and space complexity.
  • →Triangular systems (forward/back substitution) — Used after LU, QR, and Cholesky to complete solves.
  • →Basic C++ (vectors, loops, exceptions) — To read, write, and modify the provided implementations safely.
  • →Matrix multiplication basics — Helps understand how factorizations reconstruct A and how reflectors act on matrices.
  • →Least-squares problem formulation — Necessary context for when to choose QR and how to interpret solutions.

Detailed Explanation


01 Overview

Matrix factorizations are techniques that decompose a matrix into a product of simpler matrices that are easier to work with. Instead of directly manipulating a complicated matrix A, we rewrite it as a product such as A = LU (lower times upper triangular), A = QR (orthogonal times upper triangular), or A = LL^{T} (Cholesky for symmetric positive definite matrices). These forms let us solve linear systems, compute least-squares solutions, estimate determinants, and analyze numerical stability using simpler building blocks. For example, triangular systems can be solved quickly by forward or backward substitution, and orthogonal matrices preserve lengths, which improves numerical robustness.

In numerical linear algebra, these factorizations are the workhorses behind many algorithms in scientific computing, machine learning, and data analysis. They convert expensive, error-prone manipulations into structured computations with well-understood costs and stability properties. While all three factorizations have O(n^3) time complexity for dense n×n matrices, they differ in constants, stability, and applicability: LU with pivoting for general square matrices, QR (preferably Householder) for least squares or when orthogonality is advantageous, and Cholesky for the special (but common) SPD case that yields the fastest, most stable solve.

02 Intuition & Analogies

Think of a messy tool cabinet (your matrix A) that makes it hard to find and use tools (solutions). Factoring the matrix is like reorganizing the cabinet into drawers by type: one drawer for screwdrivers (lower triangular L), another for wrenches (upper triangular U), or perhaps grouping tools by size and orientation (orthogonal Q and triangular R). Once organized, you can quickly pull the right tool to solve a problem: triangular systems are easy to handle step by step, and orthogonal matrices don’t distort distances, so computations remain well-behaved.

  • LU: Imagine stacking blocks to build a wall. Lower-triangular L tells you how much of each “lower-level” block supports the current row, while U stores the resulting top edges you build on. Partial pivoting is like choosing the sturdiest block to stand on to avoid wobble (numerical instability).
  • QR: Picture aligning a set of arrows (vectors) so that each new arrow is made perpendicular to the previously aligned ones and then scaled. Q stores the orthonormal directions (nice, perpendicular axes), and R stores the lengths and how to combine them. Reflecting a vector across a plane (Householder reflection) is a stable way to realign arrows without accumulating much error.
  • Cholesky: If your cabinet is perfectly symmetric and sturdy (SPD), you can halve the work: the matrix equals L times its own transpose. It’s like using mirrored drawers—organize once, and the other side follows automatically—making the process faster and more memory efficient.

03 Formal Definition

Given a real matrix A ∈ R^{m×n}:

  • LU factorization (with partial pivoting) finds a permutation matrix P, a unit lower-triangular matrix L, and an upper-triangular matrix U such that PA = LU. This exists for any nonsingular square matrix with pivoting and is obtained via Gaussian elimination.
  • QR factorization writes A = QR, where Q ∈ R^{m×m} is orthogonal (Q^T Q = I) and R ∈ R^{m×n} is upper-triangular (often we refer to the leading n×n upper-triangular block when m ≥ n). For full-column-rank A (m ≥ n), the least-squares solution minimizes ∥Ax − b∥_2 and satisfies Rx = Q^T b (using the top n components).
  • Cholesky factorization applies to symmetric positive definite (SPD) A ∈ R^{n×n}. It finds a lower-triangular L with positive diagonal such that A = LL^T. This is unique and can be computed by a stable algorithm with about half the work of LU for dense A.

These factorizations exploit structure: triangular matrices enable O(n^2) solves, and orthogonality preserves the 2-norm, aiding numerical stability. Their existence and uniqueness depend on properties like nonsingularity, rank, and positive definiteness.

04 When to Use

  • LU with partial pivoting (PA = LU): Use for solving many linear systems Ax = b with the same square A and different right-hand sides b. It’s the default for general dense square matrices. Also useful for computing det(A) (product of U’s diagonal times pivot sign) and for inverting A in a structured way (though direct inversion is usually discouraged).
  • QR (A = QR): Prefer for overdetermined least-squares problems (m > n), where you want a numerically stable solution that avoids forming A^{T}A. Also use when you need an orthonormal basis for the column space, to estimate rank, or in algorithms like eigenvalue methods (e.g., the QR algorithm for eigenvalues).
  • Cholesky (A = LL^{T}): Choose when A is symmetric positive definite—common in covariance matrices, kernel matrices, and normal-equation matrices from least squares. It’s the fastest factorization for such A, with fewer flops and great stability. Use it to solve Ax = b (two triangular solves), compute log-determinants, and sample from Gaussian distributions.

In practice: if you know A is SPD, use Cholesky; if you have general square A, use LU with pivoting; if you face least-squares or want orthogonality, use QR (preferably Householder).

⚠️Common Mistakes

  • Skipping pivoting in LU: Without partial pivoting, Gaussian elimination can be numerically unstable or even fail due to tiny pivots. Always use partial pivoting for general matrices.
  • Using classical Gram–Schmidt for QR: It suffers loss of orthogonality in finite precision. Prefer Householder reflections (most stable) or modified Gram–Schmidt for teaching or moderate accuracy needs.
  • Applying Cholesky to non-SPD matrices: This leads to taking square roots of negative numbers or division by zero. Always check symmetry (A = A^{T} within a tolerance) and positive definiteness (leading principal minors positive, or attempt the factorization and detect breakdown).
  • Forming normal equations A^{T}A x = A^{T}b when a QR solve is available: This squares the condition number and can dramatically increase error. Use QR directly for least squares.
  • Ignoring scaling/conditioning: Poorly scaled matrices amplify rounding errors. Consider scaling rows/columns or using algorithms that reveal conditioning (e.g., rank-revealing QR).
  • Treating determinants as a stability check: A tiny determinant does not alone diagnose singularity; look at condition number or pivot sizes. Also, avoid computing det(A) via cofactors—use LU.
  • Recomputing factorizations unnecessarily: If A is fixed and multiple right-hand sides are solved, reuse the factorization to save time.

Key Formulas

LU with Partial Pivoting

PA = LU

Explanation: Any nonsingular square matrix A can be factored into a permutation P, unit lower-triangular L, and upper-triangular U. This enables fast triangular solves and stable Gaussian elimination.

QR Factorization

A = QR, Q^T Q = I

Explanation: A is written as an orthogonal matrix Q times an upper-triangular matrix R. Orthogonality preserves lengths and makes least-squares solves stable.

Cholesky Factorization

A = LL^T

Explanation: For symmetric positive definite A, there exists a unique lower-triangular L with positive diagonal such that A=LLT. This halves the work compared to LU for dense matrices.

Determinant from LU

det(A) = sgn(P) · ∏_{i=1}^{n} U_{ii}

Explanation: The determinant equals the product of diagonal entries of U times the sign of the permutation. It can be computed in O(n) after LU is available.

Least Squares via QR

x* = argmin_x ∥Ax − b∥_2, with Rx = (Q^T b)[1:n]

Explanation: For full-column-rank A, the minimizer is obtained by computing y=QTb and solving the top n×n triangular system Rx=y. This avoids forming ATA.
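The reduction behind this formula is a standard derivation: since Q is orthogonal, multiplying the residual by Q^T does not change its norm. Partitioning R into its top n×n block R_1 over a zero block, and Q^T b into c_1 (top n entries) and c_2 (remaining m − n entries):

```latex
\|Ax - b\|_2^2 = \|Q^{T}(Ax - b)\|_2^2
             = \left\| \begin{pmatrix} R_1 \\ 0 \end{pmatrix} x
                     - \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} \right\|_2^2
             = \|R_1 x - c_1\|_2^2 + \|c_2\|_2^2 .
```

Only the first term depends on x, so the minimum is attained when R_1 x = c_1, and the minimal residual norm is ∥c_2∥_2.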

2-Norm Condition Number

κ_2(A) = ∥A∥_2 · ∥A^{−1}∥_2

Explanation: This measures sensitivity of solutions to perturbations in A or b. A large condition number means small errors in data can cause large errors in the solution.

Norm Preservation by Orthogonal Q

∥Qx∥_2 = ∥x∥_2

Explanation: Orthogonal transformations do not change Euclidean lengths, which helps maintain numerical stability in QR algorithms.

Dense Flop Counts

flops(LU) ≈ (2/3)n^3, flops(QR-Householder) ≈ (4/3)n^3, flops(Cholesky) ≈ (1/3)n^3

Explanation: For large dense n×n matrices, cubic time dominates. Cholesky is fastest, LU is intermediate, and QR via Householder is costliest among the three.

Normal Equations

A^T A x = A^T b

Explanation: This characterizes least-squares solutions but squares the condition number and is less stable. Prefer QR-based methods over forming ATA explicitly.

Residual Norm

r = b − Ax, ∥r∥_2

Explanation: The residual measures how closely x satisfies Ax = b (or fits data). A small residual indicates a good fit, though the solution may still be sensitive if A is ill-conditioned.

Growth Factor in Gaussian Elimination

ρ = max_{i,j} |U_{ij}| / max_{i,j} |A_{ij}|

Explanation: This quantifies element growth during elimination. Large growth can cause numerical issues; partial pivoting controls growth in most practical cases.

Complexity Analysis

For dense n×n matrices, all three factorizations have cubic time complexity in the dominant term and quadratic space complexity. LU with partial pivoting performs about (2/3)n^3 floating-point operations (flops) to factor A and uses O(n^2) memory to store L and U (often in-place). Solving Ax = b with the LU factors then costs two triangular solves, O(n^2) time per right-hand side.

QR via Householder reflections requires roughly (4/3)n^3 flops for square matrices; for m×n with m ≥ n, it costs about 2mn^2 − (2/3)n^3. It uses O(mn) space, and explicitly forming Q increases both time and memory by constant factors. Cholesky on SPD matrices is the cheapest at about (1/3)n^3 flops with O(n^2) space, thanks to symmetry and halved work. Asymptotically, all are O(n^3), but constants matter: Cholesky < LU < QR (Householder). For multiple right-hand sides, factor once (O(n^3)) and reuse with O(n^2) per solve.

Numerical stability also affects effective performance; unstable methods may require refinement or fail. Cache-friendly (blocked) implementations and BLAS-3 kernels dramatically improve real runtime on modern hardware compared to naive loops; the provided C++ code is pedagogical rather than high-performance. For sparse matrices, complexity depends on the sparsity pattern and fill-in; specialized sparse factorizations can be much faster and more memory-efficient than the dense O(n^3)/O(n^2) bounds.

Code Examples

LU Decomposition with Partial Pivoting and Solve (PA = LU)
#include <iostream>
#include <vector>
#include <cmath>
#include <iomanip>
#include <stdexcept>
#include <limits>
#include <algorithm>

using Matrix = std::vector<std::vector<double>>;

struct LUResult {
    Matrix LU;          // Combined L (unit lower) and U (upper)
    std::vector<int> P; // Row permutation: Pb = b[P[i]]
    int sign;           // Sign of the permutation (+1 or -1)
};

// Perform in-place LU with partial pivoting on a copy of A
LUResult lu_decompose(const Matrix& A) {
    int n = (int)A.size();
    if (n == 0 || (int)A[0].size() != n) throw std::invalid_argument("A must be square");
    Matrix LU = A; // will hold both L and U
    std::vector<int> P(n);
    for (int i = 0; i < n; ++i) P[i] = i;
    int sign = 1;
    const double eps = 1e-12;

    for (int k = 0; k < n; ++k) {
        // Pivot: find row with max |LU[i][k]| for i >= k
        int piv = k;
        double maxabs = std::abs(LU[k][k]);
        for (int i = k + 1; i < n; ++i) {
            double val = std::abs(LU[i][k]);
            if (val > maxabs) { maxabs = val; piv = i; }
        }
        if (maxabs < eps) throw std::runtime_error("Matrix is singular to working precision");
        if (piv != k) {
            std::swap(LU[piv], LU[k]);
            std::swap(P[piv], P[k]);
            sign = -sign;
        }
        // Elimination
        for (int i = k + 1; i < n; ++i) {
            LU[i][k] /= LU[k][k]; // multiplier stored in L part
            double lik = LU[i][k];
            for (int j = k + 1; j < n; ++j) {
                LU[i][j] -= lik * LU[k][j];
            }
        }
    }
    return {LU, P, sign};
}

// Solve Ly = Pb (L is unit-lower inside LU)
std::vector<double> forward_subst(const Matrix& LU, const std::vector<int>& P, const std::vector<double>& b) {
    int n = (int)LU.size();
    std::vector<double> y(n);
    for (int i = 0; i < n; ++i) {
        double sum = b[P[i]]; // apply permutation P to b
        for (int j = 0; j < i; ++j) sum -= LU[i][j] * y[j];
        y[i] = sum; // L has unit diagonal
    }
    return y;
}

// Solve Ux = y (U is upper-triangular part of LU)
std::vector<double> back_subst(const Matrix& LU, const std::vector<double>& y) {
    int n = (int)LU.size();
    std::vector<double> x(n);
    const double eps = 1e-14;
    for (int i = n - 1; i >= 0; --i) {
        double sum = y[i];
        for (int j = i + 1; j < n; ++j) sum -= LU[i][j] * x[j];
        double uii = LU[i][i];
        if (std::abs(uii) < eps) throw std::runtime_error("Singular U on back substitution");
        x[i] = sum / uii;
    }
    return x;
}

// Convenience: solve Ax = b via PA = LU
std::vector<double> lu_solve(const Matrix& A, const std::vector<double>& b) {
    auto lu = lu_decompose(A);
    auto y = forward_subst(lu.LU, lu.P, b);
    auto x = back_subst(lu.LU, y);
    return x;
}

// Extract determinant quickly from LU
double lu_determinant(const LUResult& lu) {
    double det = (double)lu.sign;
    for (int i = 0; i < (int)lu.LU.size(); ++i) det *= lu.LU[i][i];
    return det;
}

int main() {
    // Example: Solve Ax = b and compute det(A)
    Matrix A = {
        { 2.0, 1.0, 1.0},
        { 4.0, -6.0, 0.0},
        {-2.0, 7.0, 2.0}
    };
    std::vector<double> b = {5.0, -2.0, 9.0};

    try {
        auto lu = lu_decompose(A);
        auto y = forward_subst(lu.LU, lu.P, b);
        auto x = back_subst(lu.LU, y);

        std::cout << std::fixed << std::setprecision(6);
        std::cout << "Solution x:";
        for (double xi : x) std::cout << " " << xi;
        std::cout << "\n";

        double detA = lu_determinant(lu);
        std::cout << "det(A) = " << detA << "\n";
    } catch (const std::exception& e) {
        std::cerr << "Error: " << e.what() << "\n";
    }
    return 0;
}

This program computes an LU factorization with partial pivoting (PA = LU) using an in-place scheme that stores L’s multipliers below the diagonal and U on and above the diagonal. It then solves Ax = b by applying the permutation to b, performing forward substitution with unit-lower L, and back substitution with U. The determinant is the product of U’s diagonal times the permutation sign.

Time: O(n^3) for factorization; O(n^2) per solve. Space: O(n^2) to store LU and O(n) for permutation and temporary vectors.
QR via Householder Reflections and Least-Squares Solve
#include <iostream>
#include <vector>
#include <cmath>
#include <iomanip>
#include <stdexcept>
#include <algorithm>

using Matrix = std::vector<std::vector<double>>;

// Build an m-length vector w that embeds v starting at index k
std::vector<double> embed(const std::vector<double>& v, int m, int k) {
    std::vector<double> w(m, 0.0);
    for (int i = 0; i < (int)v.size(); ++i) w[k + i] = v[i];
    return w;
}

// Apply Householder reflector H = I - 2 w w^T to the left of A: A <- H A
void apply_left_reflector(Matrix& A, const std::vector<double>& w) {
    int m = (int)A.size();
    int n = (int)A[0].size();
    // For each column j, col_j <- col_j - 2 * w * (w^T col_j)
    for (int j = 0; j < n; ++j) {
        double dot = 0.0;
        for (int i = 0; i < m; ++i) dot += w[i] * A[i][j];
        for (int i = 0; i < m; ++i) A[i][j] -= 2.0 * w[i] * dot;
    }
}

// Compute QR using Householder reflections. Returns Q and R explicitly.
void householder_qr(const Matrix& Ain, Matrix& Q, Matrix& R) {
    int m = (int)Ain.size();
    int n = (int)Ain[0].size();
    R = Ain; // will become upper-triangular
    Q.assign(m, std::vector<double>(m, 0.0));
    for (int i = 0; i < m; ++i) Q[i][i] = 1.0; // start with identity

    int kmax = std::min(m, n);
    for (int k = 0; k < kmax; ++k) {
        // Form Householder vector v to zero out R[k+1..m-1][k]
        double normx = 0.0;
        for (int i = k; i < m; ++i) normx = std::hypot(normx, R[i][k]);
        if (normx == 0.0) continue; // nothing to do
        double alpha = (R[k][k] >= 0.0) ? -normx : normx; // avoid cancellation

        std::vector<double> v(m - k);
        v[0] = R[k][k] - alpha;
        for (int i = k + 1; i < m; ++i) v[i - k] = R[i][k];
        // Normalize v
        double vnorm = 0.0;
        for (double vi : v) vnorm = std::hypot(vnorm, vi);
        for (double& vi : v) vi /= vnorm;

        // Build full w, apply H to R, and accumulate the reflectors in Q
        std::vector<double> w = embed(v, m, k);
        apply_left_reflector(R, w);
        apply_left_reflector(Q, w);

        // Ensure exact zeros below diagonal in column k (for cleanliness)
        for (int i = k + 1; i < m; ++i) R[i][k] = 0.0;
    }
    // At this point Q holds H_{k-1}...H_0 (each applied on the left of I), so
    // R = Q A and hence A = Q^T R. Transpose Q so that the returned factors
    // satisfy A = Q R with Q orthogonal.
    for (int i = 0; i < m; ++i)
        for (int j = i + 1; j < m; ++j)
            std::swap(Q[i][j], Q[j][i]);
}

// Solve least-squares min_x ||Ax - b||_2 using QR: x from R1 x = Q^T b (top n entries)
std::vector<double> qr_least_squares(const Matrix& A, const std::vector<double>& b) {
    int m = (int)A.size();
    int n = (int)A[0].size();
    if ((int)b.size() != m) throw std::invalid_argument("Size mismatch");
    Matrix Q, R;
    householder_qr(A, Q, R);
    // Compute y = Q^T b
    std::vector<double> y(m, 0.0);
    for (int i = 0; i < m; ++i) {
        for (int j = 0; j < m; ++j) y[i] += Q[j][i] * b[j]; // Q^T b
    }
    // Back-solve R1 x = y[0..n-1]
    std::vector<double> x(n, 0.0);
    for (int i = n - 1; i >= 0; --i) {
        double sum = y[i];
        for (int j = i + 1; j < n; ++j) sum -= R[i][j] * x[j];
        if (std::abs(R[i][i]) < 1e-14) throw std::runtime_error("Rank deficiency detected");
        x[i] = sum / R[i][i];
    }
    return x;
}

int main() {
    // Overdetermined example: fit y = ax + b to 3 points via least squares
    // A = [x 1], b = y
    Matrix A = {
        {0.0, 1.0},
        {1.0, 1.0},
        {2.0, 1.0}
    }; // m=3, n=2
    std::vector<double> b = {1.0, 2.0, 2.0};

    try {
        std::vector<double> x = qr_least_squares(A, b);
        std::cout << std::fixed << std::setprecision(6);
        std::cout << "Least-squares solution [a, b]: ";
        for (double xi : x) std::cout << xi << ' ';
        std::cout << "\n";
    } catch (const std::exception& e) {
        std::cerr << "Error: " << e.what() << "\n";
    }
    return 0;
}

This program computes a QR factorization using stable Householder reflections. It explicitly forms both Q and R, then solves an overdetermined least-squares problem by computing y = Q^T b and back-solving the leading triangular system. Householder reflections preserve norms and maintain numerical orthogonality better than classical Gram–Schmidt.

Time: O(mn^2) for m ≥ n (≈ O(n^3) for square). Space: O(mn) to store Q and R explicitly.
Cholesky Decomposition (A = L L^T) and Solve for SPD Matrices
#include <iostream>
#include <vector>
#include <cmath>
#include <iomanip>
#include <stdexcept>

using Matrix = std::vector<std::vector<double>>;

Matrix cholesky_decompose(const Matrix& A) {
    int n = (int)A.size();
    if (n == 0 || (int)A[0].size() != n) throw std::invalid_argument("A must be square");
    Matrix L(n, std::vector<double>(n, 0.0));
    const double eps = 1e-12;

    for (int i = 0; i < n; ++i) {
        for (int j = 0; j <= i; ++j) {
            double sum = A[i][j];
            for (int k = 0; k < j; ++k) sum -= L[i][k] * L[j][k];
            if (i == j) {
                if (sum <= eps) throw std::runtime_error("Matrix is not SPD (non-positive pivot)");
                L[i][j] = std::sqrt(sum);
            } else {
                L[i][j] = sum / L[j][j];
            }
        }
    }
    return L;
}

std::vector<double> forward_subst_L(const Matrix& L, const std::vector<double>& b) {
    int n = (int)L.size();
    std::vector<double> y(n, 0.0);
    for (int i = 0; i < n; ++i) {
        double sum = b[i];
        for (int j = 0; j < i; ++j) sum -= L[i][j] * y[j];
        y[i] = sum / L[i][i];
    }
    return y;
}

std::vector<double> back_subst_LT(const Matrix& L, const std::vector<double>& y) {
    int n = (int)L.size();
    std::vector<double> x(n, 0.0);
    for (int i = n - 1; i >= 0; --i) {
        double sum = y[i];
        for (int j = i + 1; j < n; ++j) sum -= L[j][i] * x[j]; // use L^T
        x[i] = sum / L[i][i];
    }
    return x;
}

int main() {
    // SPD example matrix (symmetric and positive definite)
    Matrix A = {
        {25.0, 15.0, -5.0},
        {15.0, 18.0, 0.0},
        {-5.0, 0.0, 11.0}
    };
    std::vector<double> b = {35.0, 33.0, 6.0};

    try {
        Matrix L = cholesky_decompose(A);
        auto y = forward_subst_L(L, b);
        auto x = back_subst_LT(L, y);

        std::cout << std::fixed << std::setprecision(6);
        std::cout << "Solution x:";
        for (double xi : x) std::cout << ' ' << xi;
        std::cout << "\n";

        // Optional: log-det(A) = 2 * sum(log(L_ii))
        double logdetA = 0.0;
        for (int i = 0; i < (int)L.size(); ++i) logdetA += 2.0 * std::log(L[i][i]);
        std::cout << "log(det(A)) = " << logdetA << "\n";
    } catch (const std::exception& e) {
        std::cerr << "Error: " << e.what() << "\n";
    }
    return 0;
}

This program computes the Cholesky factor L of an SPD matrix A (A = L L^T) and solves Ax = b using two triangular solves. It also demonstrates how to compute log(det(A)) efficiently from the diagonal of L, which is useful for probabilistic models and numerical stability.

Time: O(n^3) for factorization; O(n^2) per solve. Space: O(n^2) for L and O(n) for temporaries.
Tags: lu decomposition, qr factorization, householder reflections, modified gram-schmidt, cholesky decomposition, least squares, partial pivoting, symmetric positive definite, triangular solve, condition number, determinant from lu, numerical stability, residual norm, dense linear algebra, blas lapack