How I Study AI

KKT Conditions

Key Points

  • KKT conditions generalize Lagrange multipliers to handle inequality constraints in constrained optimization problems.
  • They consist of four parts: stationarity, primal feasibility, dual feasibility, and complementary slackness.
  • For convex problems with a regularity condition (like Slater's condition), KKT conditions are not just necessary but also sufficient for optimality.
  • Lagrange multipliers (dual variables) can be interpreted as shadow prices that measure how much the objective would improve if a constraint were relaxed.
  • You can numerically check KKT residuals to assess solution quality and stopping criteria in algorithms.
  • For quadratic programs with linear constraints, KKT reduces to solving a linear system; active-set methods enumerate active constraints.
  • Numerical implementations must tolerate small violations using tolerances because floating-point arithmetic is inexact.
  • KKT underpins modern algorithms such as interior-point methods and SQP, and is used widely in machine learning (e.g., SVMs).

Prerequisites

  • Multivariable calculus (gradients) — KKT stationarity uses gradients of the objective and constraints.
  • Linear algebra — Solving KKT systems requires comfort with matrices, permutations, and linear independence (LICQ).
  • Convex analysis — Sufficiency of KKT relies on convexity of the objective and constraints and conditions like Slater's.
  • Lagrange multipliers (equality constraints) — KKT extends this classical method to inequalities; the Lagrangian is central.
  • Numerical linear algebra — Stable and efficient solution of KKT systems (conditioning, pivoting, factorization).
  • Quadratic forms and positive (semi)definiteness — QP structure and properties of the KKT matrix depend on Q being PSD/PD.
  • Optimization duality basics — Understanding dual feasibility, strong duality, and shadow prices.
  • Floating-point arithmetic and tolerances — Practical KKT checks use tolerances; understanding rounding errors avoids false failures.

Detailed Explanation


01 Overview

Imagine trying to minimize a cost while respecting a set of do-not-violate rules. Karush–Kuhn–Tucker (KKT) conditions are the rulebook that tells you whether you have successfully minimized the cost without breaking any rules. They extend the classical Lagrange multiplier method (which handles equality constraints) to also support inequality constraints, thus covering a large class of real-world optimization problems. KKT conditions combine information from the objective function and the constraints into a single structure, the Lagrangian, and articulate four requirements—stationarity, primal feasibility, dual feasibility, and complementary slackness—that a candidate solution must satisfy at an optimum. When the problem is convex and mild regularity conditions hold (such as Slater's condition), these requirements are not only necessary but also sufficient, meaning if they hold, you have found a global optimum.

Hook: If you have ever pushed a box against a wall and felt the wall pushing back, you already have the right intuition—constraints push back with forces; KKT tells you when all pushes are perfectly balanced.

Concept: KKT characterizes optimal points via gradients and multipliers attached to constraints, ensuring no direction can reduce the cost without violating constraints.

Example: Minimizing distance to a point subject to staying on or beyond a line leads to a solution exactly on the line with a positive multiplier that balances the gradient of the objective and the constraint's normal.

02 Intuition & Analogies

Hook: Think of hiking downhill (minimizing elevation) in a valley with fences (constraints). You want the lowest point you can reach without crossing any fence. If you're in the interior, you just walk until the slope is zero. If you hit a fence, you can only move along it. At the bottom, the slope that wants to pull you downhill is exactly canceled by the fence pushing back.

Concept: The gradient of the objective points in the direction of steepest increase; the negative gradient points downhill. Constraints act like walls that block motion in certain directions. Lagrange multipliers are the magnitudes of the pushes from each active wall. Stationarity says "sum of downhill pulls plus pushes from active walls equals zero." Primal feasibility means you didn't cross any fence; dual feasibility says walls can only push (nonnegative multipliers), never pull; complementary slackness says a wall pushes you only if you're exactly touching it (if you're away from the wall, that multiplier must be zero).

Example: Suppose you want the closest point to (1,2) but must stay in the region x + y ≄ 4. The unconstrained best is (1,2) (already the closest to itself), but it violates x + y ≄ 4. Projecting onto the line x + y = 4 gives (1.5, 2.5). The gradient of the squared distance at that point is (1,1), and the constraint's normal is (āˆ’1,āˆ’1). A positive multiplier 1 on that constraint makes (1,1) + 1Ā·(āˆ’1,āˆ’1) = (0,0): perfect balance.

03 Formal Definition

We consider the nonlinear program: minimize f(x) subject to g_i(x) ≤ 0 for i = 1,…,m and h_j(x) = 0 for j = 1,…,p, where f, g_i, h_j are differentiable. Define the Lagrangian L(x, μ, Ī») = f(x) + āˆ‘_{i=1}^m μ_i g_i(x) + āˆ‘_{j=1}^p Ī»_j h_j(x) with multipliers μ ∈ ā„^m and Ī» ∈ ā„^p. KKT conditions at a candidate optimum x* (with multipliers μ*, Ī»*) are:

1) Stationarity: āˆ‡f(x*) + āˆ‘_{i=1}^m μ_i* āˆ‡g_i(x*) + āˆ‘_{j=1}^p Ī»_j* āˆ‡h_j(x*) = 0.
2) Primal feasibility: g_i(x*) ≤ 0 and h_j(x*) = 0.
3) Dual feasibility: μ_i* ≄ 0 for all i.
4) Complementary slackness: μ_i* g_i(x*) = 0 for all i.

Under appropriate constraint qualifications (CQs), e.g., Slater's condition for convex problems, these conditions are necessary for optimality. If, in addition, the problem is convex (f convex, g_i convex, h_j affine), KKT conditions are also sufficient: any (x*, μ*, Ī»*) satisfying KKT is globally optimal, and strong duality holds (primal and dual optimal values are equal).

04 When to Use

Use KKT conditions when you need optimality certificates or stopping criteria in constrained optimization. They are indispensable in: (1) Convex optimization with inequalities (e.g., quadratic programs, logistic regression with constraints, LASSO variants), to both certify and compute solutions. (2) Machine learning models such as SVMs, where KKT yields the dual problem and support vector characterization (complementary slackness identifies active constraints). (3) Engineering design and control (trajectory optimization with bounds, resource allocation), where multipliers quantify trade-offs (shadow prices). (4) Algorithm design, e.g., interior-point methods and sequential quadratic programming (SQP), where solving KKT linear systems is the core step. (5) Simple small-scale problems where active sets can be enumerated and KKT equations solved directly.

Hook: If you want a numeric "green light" that your candidate is optimal, check KKT residuals.

Example: In a small quadratic program with a handful of linear inequalities, testing every subset of possibly active constraints and solving the KKT system gives you the exact optimizer and multipliers.

āš ļøCommon Mistakes

  • Ignoring constraint qualifications: KKT may fail to be necessary if CQs (e.g., LICQ, Slater) do not hold; test feasibility and regularity before concluding optimality.
  • Wrong sign conventions: With inequalities g_i(x) ≤ 0, multipliers must satisfy μ_i ≄ 0 and the Lagrangian uses "+ μ_i g_i(x)". Reversing signs breaks stationarity or dual feasibility.
  • Mixing active/inactive logic: Complementary slackness requires μ_i g_i(x) = 0. If g_i(x) < 0 substantially but μ_i > 0, or if g_i(x) ā‰ˆ 0 but μ_i < 0, KKT is violated.
  • Numerical tolerance issues: Floating-point noise means exact zeros are rare. Use tolerances (e.g., 1eāˆ’8) for feasibility and complementarity checks.
  • Treating KKT as sufficient in nonconvex problems: For nonconvex objectives, KKT are necessary (under CQs) but not sufficient; they can certify stationary points that are not global minima.
  • Singular KKT systems: Equality-constrained QPs can yield singular KKT matrices if constraints are linearly dependent; check rank or use regularization.
  • Forgetting equality constraints are unrestricted in sign: Their multipliers (Ī») can be positive or negative; only inequality multipliers (μ) must be nonnegative.
  • Overlooking scaling: Poor scaling of Q, A, and gradients can cause unstable solves; scale variables/constraints for numerical robustness.

Key Formulas

Primal problem

min_{x ∈ ā„^n} f(x)   s.t.   g_i(x) ≤ 0, i = 1,…,m;   h_j(x) = 0, j = 1,…,p

Explanation: This is the general nonlinear program with inequality and equality constraints. It describes the optimization landscape KKT conditions apply to.

Lagrangian

L(x, μ, Ī») = f(x) + āˆ‘_{i=1}^m μ_i g_i(x) + āˆ‘_{j=1}^p Ī»_j h_j(x)

Explanation: The Lagrangian blends the objective and constraints using multipliers. It is central to expressing KKT stationarity and deriving the dual.

Stationarity

āˆ‡f(x*) + āˆ‘_{i=1}^m μ_i* āˆ‡g_i(x*) + āˆ‘_{j=1}^p Ī»_j* āˆ‡h_j(x*) = 0

Explanation: At an optimal point, the gradient of the Lagrangian with respect to x is zero. This means no feasible first-order descent direction exists.

Primal feasibility

g_i(x*) ≤ 0,   h_j(x*) = 0

Explanation: The candidate solution must satisfy all inequality and equality constraints. Violating these invalidates optimality.

Dual feasibility

μ_i* ≄ 0

Explanation: Inequality multipliers cannot be negative; constraints can only push back, not pull. Equality multipliers are unrestricted in sign.

Complementary slackness

μ_i* g_i(x*) = 0

Explanation: Each inequality is either active (g_i = 0, possibly with a positive multiplier) or inactive (g_i < 0, forcing μ_i = 0). The multiplier μ_i and the slack āˆ’g_i cannot both be nonzero.

Dual function

q(μ, Ī») = inf_x L(x, μ, Ī»)

Explanation: The dual function gives a lower bound on the primal optimal value for any feasible multipliers. Maximizing it yields the dual problem.

Dual problem

max_{μ ≄ 0, Ī»} q(μ, Ī»)

Explanation: The dual problem searches for the best lower bound via multipliers. Under strong duality, its optimal value equals the primal’s.

Slater's condition

∃ x : g_i(x) < 0 āˆ€i,   h_j(x) = 0 āˆ€j

Explanation: This strict feasibility condition (for convex problems) ensures strong duality and that KKT conditions are necessary and sufficient.

KKT system for equality-constrained QP

[ Q     A_Aįµ€ ] [ x   ]   [ āˆ’c  ]
[ A_A   0    ] [ Ī»_A ] = [ b_A ]

Explanation: For minimizing (1/2) xįµ€ Q x + cįµ€ x with active constraints A_A x = b_A, this linear system yields x and the active multipliers. It is the core of active-set methods.
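As a concrete instance, plugging in this page's running example (Q = 2I, c = (āˆ’2, āˆ’4)įµ€, single active constraint āˆ’x₁ āˆ’ xā‚‚ = āˆ’4) gives a 3Ɨ3 system:

```latex
\begin{bmatrix} 2 & 0 & -1 \\ 0 & 2 & -1 \\ -1 & -1 & 0 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \lambda \end{bmatrix}
=
\begin{bmatrix} 2 \\ 4 \\ -4 \end{bmatrix},
\qquad
x^\ast = (1.5,\ 2.5), \quad \lambda^\ast = 1.
```

The first two rows are stationarity (2x₁ āˆ’ Ī» = 2 and 2xā‚‚ āˆ’ Ī» = 4), the last row is the active constraint; substituting x₁ = (2 + Ī»)/2 and xā‚‚ = (4 + Ī»)/2 into x₁ + xā‚‚ = 4 gives Ī» = 1.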

KKT residual norm

ā€– ( āˆ‡_x L(x, μ, Ī»),  max{g(x), 0},  h(x),  μ ∘ g(x) ) ā€–ā‚‚

Explanation: A practical stopping metric: near zero indicates approximate satisfaction of KKT. The max{g, 0} term measures inequality violations only when positive.

Saddle-point characterization

L(x*, μ, Ī») ≤ L(x*, μ*, Ī»*) ≤ L(x, μ*, Ī»*)

Explanation: At optimality for convex problems, (x*, μ*, Ī»*) forms a saddle point of the Lagrangian. This underlies strong duality and the sufficiency of KKT.

Complexity Analysis

Checking KKT conditions is typically inexpensive compared to solving the optimization problem. If gradients are available, evaluating stationarity requires computing āˆ‡f(x) and the weighted sum of constraint gradients, which costs O(n + mĀ·n + pĀ·n) = O((m + p) n) for dense functions, plus evaluating constraints O(m + p). Computing residual norms is linear in n and the number of constraints. If gradients are approximated via finite differences, cost increases by a factor of O(n) per function evaluation.

Solving problems via KKT can be more expensive. For quadratic programs with k active constraints (treated as equalities), the KKT linear system has size (n + k) Ɨ (n + k). Direct Gaussian elimination with partial pivoting costs O((n + k)^3) time and O((n + k)^2) space for dense matrices. Active-set enumeration over m inequalities is exponential, O(2^m (n + k)^3), since each subset of constraints may require solving one KKT system; this is only practical for small m. Interior-point and SQP methods repeatedly form and solve linearized KKT systems. With dense linear algebra, each iteration is O(n^3) (or O((n + m)^3) including constraints). With sparse structure and specialized solvers, practical complexity can be nearly linear in problem size for large, sparse systems. Memory usage scales with storing Jacobians/Hessians (O(n^2) dense, less with sparsity).

In summary: verifying KKT is cheap; globally solving via naive enumeration is exponential; scalable solvers rely on efficient, often sparse, KKT factorizations.

Code Examples

Numerically check KKT conditions for a simple convex problem
#include <bits/stdc++.h>
using namespace std;

// Helper: L2 norm of a vector
double norm2(const vector<double>& v) {
    double s = 0.0; for (double x : v) s += x * x; return sqrt(s);
}

// Helper: vector add
vector<double> add(const vector<double>& a, const vector<double>& b) {
    vector<double> c(a.size());
    for (size_t i = 0; i < a.size(); ++i) c[i] = a[i] + b[i];
    return c;
}

// Helper: scalar * vector
vector<double> scal(double alpha, const vector<double>& a) {
    vector<double> c(a.size());
    for (size_t i = 0; i < a.size(); ++i) c[i] = alpha * a[i];
    return c;
}

int main() {
    // Problem: minimize f(x) = (x1-1)^2 + (x2-2)^2
    // subject to g(x) = 4 - x1 - x2 <= 0 (equivalently x1 + x2 >= 4)
    // No equality constraints.

    auto f = [](const vector<double>& x) {
        return (x[0]-1)*(x[0]-1) + (x[1]-2)*(x[1]-2);
    };
    auto grad_f = [](const vector<double>& x) {
        // āˆ‡f = 2*(x - [1,2])
        return vector<double>{2.0*(x[0]-1.0), 2.0*(x[1]-2.0)};
    };

    auto g = [](const vector<double>& x) {
        // g(x) = 4 - x1 - x2
        return vector<double>{4.0 - x[0] - x[1]};
    };
    auto grad_g = [](const vector<double>& x) {
        // āˆ‡g = [-1, -1]
        (void)x; // unused
        return vector<vector<double>>{ {-1.0, -1.0} };
    };

    // Analytic solution: x* = (1.5, 2.5), μ* = 1
    vector<double> x = {1.5, 2.5};
    vector<double> mu = {1.0}; // inequality multiplier ≄ 0
    vector<double> lambda;     // no equalities

    // Compute KKT residuals
    vector<double> st = grad_f(x); // stationarity accumulator
    auto G = grad_g(x);
    for (size_t i = 0; i < mu.size(); ++i) {
        // add μ_i āˆ‡g_i
        st = add(st, scal(mu[i], G[i]));
    }

    // Inequality and equality feasibility
    vector<double> gv = g(x);
    // For reporting, compute max(0, g)
    vector<double> pos_g(gv.size());
    for (size_t i = 0; i < gv.size(); ++i) pos_g[i] = max(0.0, gv[i]);

    // Complementary slackness residuals: μ_i * g_i(x)
    vector<double> comp(gv.size());
    for (size_t i = 0; i < gv.size(); ++i) comp[i] = mu[i] * gv[i];

    double tol = 1e-8;

    cout << fixed << setprecision(8);
    cout << "f(x*) = " << f(x) << "\n";
    cout << "Stationarity residual ||āˆ‡f + Ī£ μ_i āˆ‡g_i||_2 = " << norm2(st) << "\n";
    cout << "Inequality feasibility max(g,0) L2 norm = " << norm2(pos_g) << "\n";
    cout << "Dual feasibility (all μ_i >= 0): "
         << ((all_of(mu.begin(), mu.end(), [](double v){ return v >= -1e-12; })) ? "OK" : "FAIL") << "\n";
    cout << "Complementary slackness ||μ ∘ g(x)||_2 = " << norm2(comp) << "\n";

    bool kkt_ok = (norm2(st) <= tol) && (norm2(pos_g) <= tol)
                  && all_of(mu.begin(), mu.end(), [](double v){ return v >= -1e-12; })
                  && (norm2(comp) <= tol);
    cout << "KKT satisfied within tol? " << (kkt_ok ? "YES" : "NO") << "\n";
    return 0;
}

We define a 2D convex problem: minimize squared distance to (1,2) subject to x1 + x2 ≄ 4. We compute the gradient of f, the constraint and its gradient, and the analytic solution (x*, μ*) = ((1.5, 2.5), 1). The program assembles the KKT residuals: stationarity, feasibility, dual feasibility, and complementary slackness. All residuals are zero (up to numerical tolerance), confirming KKT optimality.

Time: O(n + mĀ·n) for gradients and residuals (here n = 2, m = 1). Space: O(n + m) to store vectors and residuals.
Solve a small convex quadratic program via KKT active-set enumeration
#include <bits/stdc++.h>
using namespace std;

// Basic dense linear algebra helpers for small systems
using Mat = vector<vector<double>>;
using Vec = vector<double>;

Vec matVec(const Mat& A, const Vec& x) {
    size_t n = A.size(), m = x.size();
    Vec y(n, 0.0);
    for (size_t i = 0; i < n; ++i)
        for (size_t j = 0; j < m; ++j) y[i] += A[i][j] * x[j];
    return y;
}

Mat transpose(const Mat& A) {
    size_t n = A.size(), m = A[0].size();
    Mat AT(m, vector<double>(n));
    for (size_t i = 0; i < n; ++i)
        for (size_t j = 0; j < m; ++j) AT[j][i] = A[i][j];
    return AT;
}

// Solve A x = b via Gaussian elimination with partial pivoting
bool solveLinearSystem(Mat A, Vec b, Vec& x) {
    const double EPS = 1e-12;
    int n = (int)A.size();
    for (int i = 0; i < n; ++i) A[i].push_back(b[i]); // augment
    // Forward elimination
    for (int col = 0, row = 0; col < n && row < n; ++col, ++row) {
        // Pivot
        int piv = row;
        for (int i = row+1; i < n; ++i)
            if (fabs(A[i][col]) > fabs(A[piv][col])) piv = i;
        if (fabs(A[piv][col]) < EPS) return false; // singular
        swap(A[piv], A[row]);
        // Normalize
        double div = A[row][col];
        for (int j = col; j <= n; ++j) A[row][j] /= div;
        // Eliminate
        for (int i = 0; i < n; ++i) if (i != row) {
            double f = A[i][col];
            for (int j = col; j <= n; ++j) A[i][j] -= f * A[row][j];
        }
    }
    x.assign(n, 0.0);
    for (int i = 0; i < n; ++i) x[i] = A[i][n];
    return true;
}

// Build KKT system for equality-constrained QP: minimize 1/2 x^T Q x + c^T x s.t. Aeq x = beq
// Returns (x, lambda) by solving [Q A^T; A 0][x; lambda] = [-c; beq]
bool solveEqualityConstrainedQP(const Mat& Q, const Vec& c, const Mat& Aeq, const Vec& beq,
                                Vec& x, Vec& lambda) {
    int n = (int)Q.size();
    int k = (int)Aeq.size();
    int N = n + k;
    Mat KKT(N, vector<double>(N, 0.0));
    // Top-left: Q
    for (int i = 0; i < n; ++i) for (int j = 0; j < n; ++j) KKT[i][j] = Q[i][j];
    // Top-right: A^T
    for (int i = 0; i < n; ++i) for (int j = 0; j < k; ++j) KKT[i][n + j] = Aeq[j][i];
    // Bottom-left: A
    for (int i = 0; i < k; ++i) for (int j = 0; j < n; ++j) KKT[n + i][j] = Aeq[i][j];
    // Bottom-right: zeros
    // Build RHS: [-c; beq]
    Vec rhs(N, 0.0);
    for (int i = 0; i < n; ++i) rhs[i] = -c[i];
    for (int i = 0; i < k; ++i) rhs[n + i] = beq[i];
    Vec sol;
    if (!solveLinearSystem(KKT, rhs, sol)) return false;
    x.assign(sol.begin(), sol.begin() + n);
    lambda.assign(sol.begin() + n, sol.end());
    return true;
}

// Active-set enumeration for Ax <= b (inequalities). We assume convex Q (symmetric PSD).
struct QP {
    Mat Q; // n x n (symmetric PSD)
    Vec c; // n
    Mat A; // m x n
    Vec b; // m
};

int main() {
    // Example: minimize (x-1)^T (x-1) with constraint x1 + x2 >= 4
    // Write as: minimize 1/2 x^T Q x + c^T x, with Q = 2I, c = -2*[1,2]
    // Constraint g(x) = A x - b <= 0 as -x1 - x2 + 4 <= 0 -> A = [-1 -1], b = -4

    QP prob;
    int n = 2, m = 1;
    prob.Q = {{2.0, 0.0}, {0.0, 2.0}}; // 2I
    prob.c = {-2.0*1.0, -2.0*2.0};     // -2a
    prob.A = {{-1.0, -1.0}};
    prob.b = {-4.0};

    double bestObj = numeric_limits<double>::infinity();
    Vec bestX, bestMu;
    const double tol = 1e-8;

    // Enumerate all subsets of constraints as candidate active sets
    for (int mask = 0; mask < (1<<m); ++mask) {
        // Build A_active, b_active from active constraints (those with bit = 1)
        vector<int> activeIdx;
        for (int i = 0; i < m; ++i) if (mask & (1<<i)) activeIdx.push_back(i);
        Mat Aeq; Vec beq;
        for (int idx : activeIdx) { Aeq.push_back(prob.A[idx]); beq.push_back(prob.b[idx]); }

        Vec x, lambda_act;
        if (!solveEqualityConstrainedQP(prob.Q, prob.c, Aeq, beq, x, lambda_act)) continue; // singular -> skip

        // Check primal feasibility: Ax - b <= 0 with tolerance
        bool feasible = true;
        for (int i = 0; i < m; ++i) {
            double gi = prob.A[i][0]*x[0] + prob.A[i][1]*x[1] - prob.b[i];
            if (gi > tol) { feasible = false; break; }
        }
        if (!feasible) continue;

        // Dual feasibility for active inequality multipliers: mu >= 0
        // Here, the Lagrangian uses + mu^T (A x - b), so mu = lambda_act for active constraints
        bool dual_ok = true;
        for (double mu_i : lambda_act) if (mu_i < -1e-10) { dual_ok = false; break; }
        if (!dual_ok) continue;

        // Complementary slackness for inactive constraints: they must be strictly inactive (g < 0) or have mu = 0
        // Since inactive mu = 0 by construction, just ensure g <= 0 already (feasible).

        // Objective value: 1/2 x^T Q x + c^T x
        double obj = 0.0;
        for (int i = 0; i < n; ++i) for (int j = 0; j < n; ++j) obj += 0.5 * x[i] * prob.Q[i][j] * x[j];
        for (int i = 0; i < n; ++i) obj += prob.c[i] * x[i];

        if (obj < bestObj) {
            bestObj = obj;
            bestX = x;
            // Map active multipliers into full mu vector (inactive = 0)
            bestMu.assign(m, 0.0);
            for (size_t k = 0; k < activeIdx.size(); ++k) bestMu[activeIdx[k]] = lambda_act[k];
        }
    }

    cout << fixed << setprecision(8);
    if (bestX.empty()) {
        cout << "No feasible KKT point found.\n";
        return 0;
    }

    cout << "Best solution found by active-set enumeration:\n";
    cout << "x* = [" << bestX[0] << ", " << bestX[1] << "]\n";
    cout << "Objective f(x*) = " << bestObj << "\n";
    cout << "Multipliers mu* = [" << bestMu[0] << "] (inactive constraints have mu=0)\n";

    // Verify KKT stationarity: Q x + c + A^T mu = 0
    Vec gradL = { prob.Q[0][0]*bestX[0] + prob.Q[0][1]*bestX[1] + prob.c[0],
                  prob.Q[1][0]*bestX[0] + prob.Q[1][1]*bestX[1] + prob.c[1] };
    // Add A^T mu
    for (int j = 0; j < n; ++j)
        for (int i = 0; i < m; ++i) gradL[j] += prob.A[i][j] * bestMu[i];

    // g(x)
    Vec gx(m, 0.0);
    for (int i = 0; i < m; ++i) gx[i] = prob.A[i][0]*bestX[0] + prob.A[i][1]*bestX[1] - prob.b[i];

    // Residuals
    auto norm2 = [](const Vec& v){ double s=0; for(double z:v) s+=z*z; return sqrt(s); };
    Vec comp(m); for (int i = 0; i < m; ++i) comp[i] = bestMu[i] * gx[i];
    Vec pos_g = gx; for (double& z : pos_g) z = max(0.0, z);

    cout << "KKT check:\n";
    cout << "||stationarity|| = " << norm2(gradL) << "\n";
    cout << "||max(g,0)|| = " << norm2(pos_g) << "\n";
    cout << "dual feasibility (mu >= 0): "
         << (all_of(bestMu.begin(), bestMu.end(), [](double v){ return v >= -1e-12; }) ? "OK" : "FAIL") << "\n";
    cout << "||mu ∘ g|| = " << norm2(comp) << "\n";

    return 0;
}

This program solves a tiny convex quadratic program with linear inequalities by enumerating all active sets. For each subset, it solves the equality-constrained KKT system [Q A^T; A 0][x; Ī»] = [āˆ’c; b] to obtain x and multipliers for those constraints, then checks primal and dual feasibility. The best feasible candidate is returned, and KKT residuals are reported. For our example, the solution is x* = (1.5, 2.5) with μ* = 1, matching the analytical KKT solution.

Time: O(2^m Ā· (n + k)^3) due to active-set enumeration and dense linear solves. Space: O((n + k)^2) for the KKT matrix during each solve; overall O(n^2 + m n).
Tags: kkt conditions Ā· lagrangian Ā· complementary slackness Ā· dual feasibility Ā· slater condition Ā· active set Ā· quadratic program Ā· kkt system Ā· strong duality Ā· convex optimization Ā· stationarity Ā· constraint qualification Ā· licq Ā· interior point Ā· saddle point