How I Study AI - Learn AI Papers & Lectures the Easy Way
Algorithm · Intermediate

Stratified & Latin Hypercube Sampling

Key Points

  • Stratified sampling reduces Monte Carlo variance by dividing the domain into non-overlapping regions (strata) and sampling within each region.
  • Latin Hypercube Sampling (LHS) ensures uniform coverage in every dimension by taking exactly one sample from each 1/n interval per dimension and permuting independently across dimensions.
  • Both methods keep marginal distributions uniform while improving coverage, leading to more stable estimates for the same number of samples.
  • In practice, stratified sampling needs correct weighting when strata have different sizes; otherwise estimates become biased.
  • LHS is especially effective in moderate dimensions when the function varies smoothly or monotonically along each coordinate.
  • Time complexity for generating n samples in d dimensions is O(n d) and space complexity is typically O(n d).
  • Random jitter inside each stratum prevents grid artifacts and preserves unbiasedness.
  • Use independent shuffles per dimension for LHS; reusing the same permutation can reintroduce unwanted correlations.

Prerequisites

  • Uniform random number generation — Sampling within strata and jittering rely on drawing uniform random variables.
  • Basic probability and expectation — Understanding unbiased estimators and variance reduction requires the concepts of expectation and variance.
  • Monte Carlo integration — Stratification and LHS are alternative designs for Monte Carlo estimators.
  • Permutations and shuffling (Fisher–Yates) — LHS construction needs independent random permutations per dimension.
  • Asymptotic notation — To analyze time and space complexity and compare methods using O(·) notation.
  • Numerical integration over intervals and boxes — Provides context for what is being estimated and why coverage matters.

Detailed Explanation


01 Overview

Hook: Imagine surveying a city: if you only ask people downtown at noon, you miss commuters and night-shift workers. A better plan is to split the city into neighborhoods and sample each one. That way, your average reflects the whole city.

Concept: Stratified sampling formalizes this idea for Monte Carlo—split the domain into strata, then sample within each stratum and combine results with appropriate weights. Latin Hypercube Sampling (LHS) goes a step further for multi-dimensional problems by enforcing even coverage along every coordinate, so you don’t accidentally cluster samples in the same x- or y-interval.

Example: To estimate an integral over [0,1], we can divide it into n equal subintervals and pick one random point per subinterval. In 2D, stratified sampling might use an s×s grid; in d dimensions, LHS chooses exactly one point from each of the n subintervals along every axis, pairing them via independent permutations to keep marginals uniform but reduce clumping. The result is lower estimator variance for many functions compared to naive Monte Carlo with the same sample budget.

02 Intuition & Analogies

Hook: Filling a chocolate box evenly is easier if you place one piece in each slot rather than tossing them randomly and hoping every slot gets filled. Stratified and Latin hypercube sampling do exactly that for the space of possibilities.

Concept: Random sampling can bunch points in some regions and leave gaps in others purely by chance. Stratification is like drawing a grid over the space and promising yourself you’ll visit every cell. That promise reduces randomness where it hurts (coverage) while keeping enough randomness (jitter) to avoid bias. LHS adds a clever twist for high dimensions: instead of a full grid (which explodes combinatorially), it ensures one sample per 1/n slice of each axis and then shuffles those slices independently per dimension. That way, each coordinate is well spread, but you still get n total samples, not n^d.

Example: Suppose you’re testing a video game across screen sizes and frame rates. Pure random tests might accidentally cluster around mid-range screens and 60 FPS. Stratified sampling would split screen sizes and frame rates into bins and test each bin. LHS would ensure that across n tests, every screen-size bin and every FPS bin appears exactly once, paired randomly, giving broad coverage without an exponential blowup of combinations.

03 Formal Definition

Hook: Coverage without explosion.

Concept: Let D be a measurable domain with measure μ. A stratification is a collection of disjoint sets {S_h}_{h=1}^{H} such that D = ⋃_{h=1}^{H} S_h. If we draw n_h independent samples X_{h,i} uniformly from S_h, the stratified estimator for the mean of f over D is μ̂ = ∑_{h=1}^{H} W_h f̄_h, where W_h = μ(S_h)/μ(D) and f̄_h = (1/n_h) ∑_{i=1}^{n_h} f(X_{h,i}). Under standard assumptions, μ̂ is unbiased for μ(f) = (1/μ(D)) ∫_D f(x) dμ(x). For equal-sized strata and one sample per stratum, this simplifies to the average of f evaluated at stratified points. LHS in d dimensions with n samples constructs points X_i = (X_{i1}, …, X_{id}) in [0,1]^d such that for each coordinate j, the set {X_{1j}, …, X_{nj}} contains exactly one element from each interval ((k−1)/n, k/n], k = 1, …, n. One implementation samples U_{ij} ~ Uniform(0,1), draws independent permutations π_j of {1, …, n}, and sets X_{ij} = (π_j(i) − U_{ij})/n. This guarantees uniform marginals and improved space-filling properties while keeping the total number of samples at n.

04 When to Use

Hook: When randomness alone leaves holes, force coverage.

Concept: Use stratified sampling whenever your function’s behavior differs across regions and you can define meaningful strata (e.g., time-of-day segments, spatial tiles, parameter bins). It excels in Monte Carlo integration and simulation where variance matters: rendering (anti-aliasing, soft shadows), rare-event estimation with region-aware allocation, and survey sampling with known subpopulations. LHS is ideal for moderate-dimensional (say d up to a few dozen) uncertainty quantification and design of experiments—calibration, sensitivity analysis, and surrogate modeling—because it spreads samples evenly in each coordinate without an exponential grid.

Example:
  • Estimating ∫_{[0,1]^d} f(x) dx when f changes faster near boundaries—use more/unequal strata there.
  • Running n expensive simulations that depend on d uncertain inputs—use LHS to cover each input’s range well with just n runs.
  • Generating blue-noise-like sample sets for 2D images—use a jittered grid (stratified 2D) for aesthetically uniform points.

⚠️Common Mistakes

Hook: Good coverage can be undone by small implementation slips.

Concept: Common pitfalls include (1) forgetting weights when strata have different sizes, which biases the estimator; (2) using too few samples per stratum, making within-stratum variance estimates unstable; (3) reusing the same permutation across dimensions in LHS, which creates artificial correlations; (4) skipping random jitter and placing points on cell centers, which can bias estimates if f is aligned with the grid; (5) not randomizing across repeated trials—LHS without re-randomization can look deceptively precise; (6) mismatched boundaries where samples fall outside strata because of floating-point rounding; and (7) assuming LHS samples are independent—they are not, and this affects variance formulas.

Example fixes:
  • Always compute W_h from geometric/measure proportions and multiply stratum means accordingly.
  • For unequal strata or heterogeneous variability, use Neyman allocation (n_h ∝ W_h σ_h) rather than equal n_h.
  • In LHS, generate an independent shuffle for each dimension and a fresh set of jitters per run.
  • Use half-open intervals to avoid overlap, and clamp numerically to [0,1].

Key Formulas

Population mean over a domain

μ(f) = (1/μ(D)) ∫_D f(x) dμ(x)

Explanation: This defines the true mean value of f over domain D with respect to measure μ. Monte Carlo methods aim to estimate this quantity.

Stratified estimator

μ̂_strat = ∑_{h=1}^{H} W_h f̄_h,   f̄_h = (1/n_h) ∑_{i=1}^{n_h} f(X_{h,i}),   W_h = μ(S_h)/μ(D)

Explanation: Average within each stratum and combine using stratum weights. This estimator is unbiased if the X_{h,i} are sampled uniformly within each S_h.

Variance of stratified estimator

Var(μ̂_strat) = ∑_{h=1}^{H} W_h² σ_h² / n_h

Explanation: The total variance is the weighted sum of within-stratum variances divided by their sample counts. It is typically smaller than naive Monte Carlo when strata isolate variability.

Neyman allocation

n_h* = n · W_h σ_h / ∑_{k=1}^{H} W_k σ_k

Explanation: Given total budget n, allocating samples proportional to W_h times the stratum standard deviation σ_h minimizes the estimator variance.

Baseline Monte Carlo mean and variance

Ȳ_MC = (1/n) ∑_{i=1}^{n} f(X_i),   Var(Ȳ_MC) = σ²/n

Explanation: With i.i.d. uniform samples, the sample mean is unbiased and its variance decreases as 1/n. Stratification aims to reduce the constant factor compared to this baseline.

LHS construction

X_{ij} = (π_j(i) − U_{ij}) / n,   U_{ij} ~ Uniform(0,1)

Explanation: For each dimension j, pick an independent permutation π_j and jitter U_{ij}. This guarantees exactly one sample in each 1/n interval of each coordinate.

Variance comparison (monotone case)

Var(μ̂_LHS) ≤ Var(Ȳ_MC)   for functions monotone in each coordinate

Explanation: For many monotone functions, LHS does not increase variance and often reduces it compared to i.i.d. Monte Carlo. The inequality formalizes this for a common class.

Complexity of LHS/stratified generation

T(n, d) = O(n d),   S(n, d) = O(n d)

Explanation: Generating n points in d dimensions with stratification or LHS takes time proportional to the number of coordinates produced and linear space to store them.

Approximate confidence interval

μ̂ ± z_{α/2} √Var(μ̂)

Explanation: Using an estimate of the stratified variance and the normal quantile z, you can form an approximate (1-α) confidence interval for the mean.

Weights for unequal strata

W_h = μ(S_h) / ∑_{k=1}^{H} μ(S_k)

Explanation: When strata have different sizes, compute weights from their measures to keep the estimator unbiased.

Complexity Analysis

Generating stratified or Latin Hypercube samples is linear in the number of coordinates you produce. For n samples in d dimensions, you create n·d uniform jitters, perform d independent permutations of size n (each O(n)), and combine them, yielding time complexity O(n d). The dominant costs are the random number generation and the d shuffles, which are typically implemented in O(n) using Fisher–Yates, with high constant factors due to RNG calls.

Space complexity depends on whether you stream points or store them. If you allocate the full sample matrix, it is O(n d). If memory is constrained, you can generate per dimension on the fly: store one permutation of size n at a time (O(n)) and write points column by column, reducing peak memory to O(max(n, d)). For 2D s×s jittered grids (a common stratified case), time is O(s²) and space is O(s²) if you store points, or O(1) if you stream them.

Estimator computation (e.g., evaluating f at each point and averaging) adds O(n d) for coordinate handling plus the cost of evaluating f, often O(n) if f is O(1) per point. Compared to naive Monte Carlo (also O(n d)), stratified/LHS methods keep the same asymptotic complexity but typically achieve lower variance per sample, meaning they can reach a target error tolerance with fewer samples in practice.

Code Examples

1D Stratified Sampling to Estimate an Integral on [0,1]
#include <bits/stdc++.h>
using namespace std;

// Function to integrate: f(x) = exp(-x^2)
static inline double f(double x) { return exp(-x * x); }

// Plain Monte Carlo estimator with n samples on [0,1]
double mc_integral_1d(size_t n, mt19937_64 &rng) {
    uniform_real_distribution<double> U(0.0, 1.0);
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i) {
        double x = U(rng);
        sum += f(x);
    }
    return sum / static_cast<double>(n);
}

// Stratified estimator with n strata (one sample per stratum) on [0,1]
double stratified_integral_1d(size_t n, mt19937_64 &rng) {
    uniform_real_distribution<double> U(0.0, 1.0);
    double sum = 0.0;
    for (size_t k = 0; k < n; ++k) {
        // Stratum interval: [k/n, (k+1)/n); jitter u in [0,1)
        double u = U(rng);
        double x = (k + u) / static_cast<double>(n); // uniform in stratum
        sum += f(x);
    }
    // All strata are equal length => weight is 1/n for each
    return sum / static_cast<double>(n);
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    const size_t n = 10000;           // number of samples/strata
    mt19937_64 rng(123456789ULL);     // fixed seed for reproducibility

    double mc = mc_integral_1d(n, rng);
    // Use a separate RNG stream for an independent run
    mt19937_64 rng2(987654321ULL);
    double strat = stratified_integral_1d(n, rng2);

    // The true integral has no elementary closed form; compare approximations only
    cout.setf(std::ios::fixed); cout << setprecision(6);
    cout << "Plain MC estimate:   " << mc << "\n";
    cout << "Stratified estimate: " << strat << "\n";
    cout << "(Both estimate ∫_0^1 exp(-x^2) dx)\n";
    return 0;
}

We approximate ∫_0^1 e^{-x^2} dx. The plain Monte Carlo uses i.i.d. uniform x. The stratified version divides [0,1] into n equal subintervals and draws one jittered point from each, averaging f(x). Because coverage is enforced, stratified sampling typically shows lower variance than plain Monte Carlo for the same n.

Time: O(n) · Space: O(1)
2D Stratified (Jittered Grid) Point Generator
#include <bits/stdc++.h>
using namespace std;

struct Pt { double x, y; };

// Generate an s x s jittered grid of points over [0,1]^2
vector<Pt> jittered_grid_2d(size_t s, mt19937_64 &rng) {
    uniform_real_distribution<double> U(0.0, 1.0);
    vector<Pt> pts; pts.reserve(s * s);
    for (size_t i = 0; i < s; ++i) {
        for (size_t j = 0; j < s; ++j) {
            // Cell [i/s, (i+1)/s) x [j/s, (j+1)/s)
            double ux = U(rng), uy = U(rng);
            double x = (i + ux) / static_cast<double>(s);
            double y = (j + uy) / static_cast<double>(s);
            pts.push_back({x, y});
        }
    }
    return pts;
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    size_t s = 16; // 16x16 = 256 points
    mt19937_64 rng(42);
    auto pts = jittered_grid_2d(s, rng);

    cout.setf(std::ios::fixed); cout << setprecision(5);
    for (size_t k = 0; k < min<size_t>(pts.size(), 20); ++k) {
        cout << pts[k].x << ", " << pts[k].y << "\n";
    }
    cerr << "Generated " << pts.size() << " stratified points over [0,1]^2\n";
    return 0;
}

This generates an s×s stratified (jittered) grid in 2D. Each cell contributes exactly one random point, providing even spatial coverage suitable for rendering, sampling textures, or initializing k-means centers. You can stream the points without storing them if memory is tight.

Time: O(s²) · Space: O(s²) (O(1) if streamed)
Latin Hypercube Sampling (LHS) in d Dimensions with Integral Estimation
#include <bits/stdc++.h>
using namespace std;

// Generate n d-dimensional LHS samples in [0,1]^d
vector<vector<double>> lhs(size_t n, size_t d, mt19937_64 &rng) {
    uniform_real_distribution<double> U(0.0, 1.0);
    vector<vector<double>> X(n, vector<double>(d, 0.0));

    // For each dimension, make a permutation and jitter within bins
    for (size_t j = 0; j < d; ++j) {
        vector<size_t> pi(n);
        iota(pi.begin(), pi.end(), 0);      // 0..n-1
        shuffle(pi.begin(), pi.end(), rng); // independent permutation per dimension
        for (size_t i = 0; i < n; ++i) {
            double u = U(rng); // jitter in [0,1)
            // Map to [k/n, (k+1)/n) with k = pi[i]
            X[i][j] = (static_cast<double>(pi[i]) + u) / static_cast<double>(n);
            // Clamp (defensive): ensure value stays below 1.0
            if (X[i][j] >= 1.0) X[i][j] = nextafter(1.0, 0.0);
        }
    }
    return X;
}

// Test function: f(x) = exp(-sum_j x_j). True mean over [0,1]^d is (1 - e^{-1})^d
static inline double f(const vector<double> &x) {
    double s = 0.0;
    for (double v : x) s += v;
    return exp(-s);
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    size_t n = 5000; // number of samples
    size_t d = 5;    // dimensionality
    mt19937_64 rng(2024);

    auto X = lhs(n, d, rng);

    double sum = 0.0;
    for (const auto &x : X) sum += f(x);
    double estimate = sum / static_cast<double>(n);

    // True value for comparison: (1 - e^{-1})^d
    double truth = pow(1.0 - exp(-1.0), static_cast<int>(d));

    cout.setf(std::ios::fixed); cout << setprecision(8);
    cout << "LHS estimate: " << estimate << "\n";
    cout << "True value:   " << truth << "\n";
    cout << "Abs error:    " << fabs(estimate - truth) << "\n";
    return 0;
}

This implements Latin Hypercube Sampling by using an independent random permutation per dimension and jittering within each 1/n bin. We estimate E[f(X)] for f(x) = exp(-∑ x_j) over [0,1]^d, whose true value is (1 - e^{-1})^d. LHS provides uniform marginals and improved coverage versus plain i.i.d. sampling for many functions.

Time: O(n d) · Space: O(n d)
Tags: stratified sampling, latin hypercube sampling, variance reduction, monte carlo, jittered grid, design of experiments, space-filling, permutation shuffle, neyman allocation, uniform marginals, randomized sampling, integration, uncertainty quantification, blue-noise