How I Study AI - Learn AI Papers & Lectures the Easy Way
⚙️ Algorithm · Intermediate

Monte Carlo Estimation

Key Points

  • Monte Carlo estimation approximates an expected value by averaging function values at random samples drawn from a probability distribution.
  • The estimator \(\hat{\mu}_N = \frac{1}{N}\sum_{i=1}^{N} f(X_i)\) is unbiased and its variance shrinks like \(\mathrm{Var}(f(X))/N\).
  • By the Law of Large Numbers, the Monte Carlo average converges to the true expectation as the number of samples increases.
  • By the Central Limit Theorem, the error is approximately normal with standard error \(\sigma/\sqrt{N}\), enabling confidence intervals.
  • Importance sampling and antithetic variates can dramatically reduce variance without changing the estimator’s expectation.
  • Monte Carlo is especially powerful for high-dimensional integrals and problems with complex or unknown analytic solutions.
  • The time complexity is linear in the number of samples, and the memory cost is usually constant.
  • Careful seeding, correct weighting, and support coverage are essential to avoid bias and numerical instability.

Prerequisites

  • Basic probability and random variables — Understanding expectations, independence, and distributions is essential for why sample averages estimate means.
  • Integral calculus — Expectations can be written as integrals; Monte Carlo approximates these integrals by sampling.
  • Variance and Central Limit Theorem — Variance quantifies estimator spread, and the CLT justifies confidence intervals for Monte Carlo estimates.
  • C++ random number generation — You must know how to seed and use std::mt19937_64 and distribution objects to implement Monte Carlo.
  • Numerical stability and floating point — Log-domain computations and stable online variance are critical for rare events and long runs.

Detailed Explanation


01 Overview

Monte Carlo estimation is a general technique to approximate expectations (averages) of random quantities using randomness itself. Suppose you want E[f(X)] for some function f and random variable X with distribution P, but you cannot compute it analytically. Monte Carlo draws independent samples X_1, X_2, ..., X_N from P and computes the average (1/N) Σ f(X_i). This sample mean is a random number that, on average, equals the desired expectation and gets closer to it as you use more samples. The strength of Monte Carlo is its generality: it works for many distributions, dimensions, and functions—even when traditional calculus or closed-form solutions fail. It is widely used in physics (particle simulations), finance (option pricing), computer graphics (global illumination), machine learning (Bayesian inference), and engineering (reliability analysis). The accuracy improves as you increase N, but only at a rate proportional to 1/√N unless you apply variance reduction. You can also quantify uncertainty with confidence intervals using the Central Limit Theorem. Practically, Monte Carlo needs a high-quality pseudorandom number generator, a correct way to sample from P, and careful numerical handling when probabilities are very small or very large.

02 Intuition & Analogies

Imagine you want to know the average height of trees in a huge forest but measuring all trees is impossible. A practical approach is to randomly pick N trees, measure their heights, and average them. If your sampling is fair, this average is a good estimate of the forest’s true average height, and as you measure more trees, the estimate becomes more reliable. Monte Carlo estimation does the same for mathematical expectations: instead of trees, you randomly sample outcomes X_i from a distribution P, measure f(X_i), and average. The function f could be simple (like the square) or complicated (like a simulation that returns 1 if a system survives one year). Each sample is like a new tree measurement contributing to the overall average. The uncertainty in your final average is like the randomness in which trees you happened to pick; more samples reduce this uncertainty. Sometimes the forest has rare giant trees that dominate the average—if you rarely sample them, your average is noisy. Importance sampling is like deliberately walking to where giant trees are more common, then compensating with weights to keep the estimate fair. Antithetic variates are like measuring pairs of trees in opposite directions from a central point to cancel some randomness. Overall, Monte Carlo turns hard integrals and expectations into a counting-and-averaging game with a clear rule: sample fairly, average carefully, and use tricks to reduce noise.

03 Formal Definition

Let (Ω, F, P) be a probability space and X ∼ P a random variable taking values in 𝒳. For a measurable function f: 𝒳 → ℝ with finite variance \(\sigma^2 = \mathrm{Var}(f(X)) < \infty\), the Monte Carlo estimator of \(\mu = E[f(X)]\) using i.i.d. samples \(X_1, \dots, X_N \sim P\) is \(\hat{\mu}_N = \frac{1}{N}\sum_{i=1}^{N} f(X_i)\). Then \(E[\hat{\mu}_N] = \mu\) (unbiasedness) and \(\mathrm{Var}(\hat{\mu}_N) = \sigma^2/N\). By the Strong Law of Large Numbers, \(\hat{\mu}_N \to \mu\) almost surely as N → ∞. By the Central Limit Theorem, if \(\sigma^2 < \infty\), then \(\sqrt{N}(\hat{\mu}_N - \mu) \xrightarrow{d} N(0, \sigma^2)\). In practice, \(\sigma^2\) is unknown and we estimate it with the sample variance \(s^2 = \frac{1}{N-1}\sum_{i=1}^{N}\bigl(f(X_i) - \hat{\mu}_N\bigr)^2\), leading to an estimated standard error \(\mathrm{SE}(\hat{\mu}_N) \approx s/\sqrt{N}\) and approximate confidence intervals. For importance sampling with proposal Q having density q and target density p, if p is absolutely continuous with respect to q and w(x) = p(x)/q(x), the estimator is \(\hat{\mu}_{IS} = \frac{1}{N}\sum_{i=1}^{N} f(Y_i)\, w(Y_i)\) with \(Y_i \sim Q\), which remains unbiased when \(E_Q[\lvert f(Y)w(Y)\rvert] < \infty\).

04 When to Use

Use Monte Carlo estimation when the expectation or integral has no closed form or is too complicated for analytic techniques, particularly in high dimensions where grid-based numerical integration becomes infeasible. It is well-suited for: (1) High-dimensional integrals, such as computing volumes or Bayesian posterior expectations; (2) Stochastic simulations where f(X) encapsulates a complex process (e.g., queuing systems, reliability, epidemic spread); (3) Financial engineering tasks like pricing options and risk measures when payoff expectations under a risk-neutral measure are not solvable analytically; (4) Computer graphics global illumination, where the rendering equation is an integral over light paths; (5) Rare-event probabilities when combined with variance reduction like importance sampling; (6) Situations demanding error bars: Monte Carlo naturally provides standard errors and confidence intervals. Avoid naive Monte Carlo when the variance of f(X) is very large or infinite, or when sampling from P is itself extremely expensive; in such cases, use variance reduction (importance sampling, antithetic or control variates, stratification) or quasi–Monte Carlo. If the integrand is smooth and low-dimensional (say, 1–3 dimensions), deterministic quadrature (Simpson’s, Gauss–Legendre) may converge much faster than Monte Carlo.

⚠️Common Mistakes

  1. Biased sampling: drawing from the wrong distribution or with dependence between samples without accounting for it, which breaks unbiasedness and LLN assumptions.
  2. Incorrect weighting in importance sampling: forgetting the weight w = p/q, using unnormalized densities inconsistently, or choosing a proposal with lighter tails than the target so that weights explode, causing enormous variance.
  3. Ignoring support mismatch: if q(x) = 0 where p(x)f(x) ≠ 0, the importance sampler is biased or undefined.
  4. Underestimating uncertainty: reporting the sample mean without a standard error or confidence interval, or computing variance with divisor N instead of N−1 for the unbiased estimator.
  5. Numerical instability: computing weights in probability scale rather than log-scale for extreme events, leading to overflow/underflow.
  6. Poor random number handling: not seeding reproducibly when needed, reusing the same seed across parallel workers, or mixing different PRNGs improperly.
  7. Inefficient variance: using plain Monte Carlo for rare events where almost all samples contribute zero; importance sampling or stratification is needed.
  8. Overinterpreting small N: early runs can look accurate by chance; always check convergence with increasing N or replicate runs to assess variability.

Key Formulas

Monte Carlo Estimator

\(\hat{\mu}_N = \frac{1}{N}\sum_{i=1}^{N} f(X_i), \qquad X_i \overset{\text{i.i.d.}}{\sim} P\)

Explanation: Average the function f over N independent samples from P. This unbiased estimator converges to the true expectation as N grows.

Unbiasedness and Variance

\(E[\hat{\mu}_N] = \mu, \qquad \mathrm{Var}(\hat{\mu}_N) = \frac{\mathrm{Var}(f(X))}{N}\)

Explanation: The estimator’s expected value equals the true mean, and its variance decreases proportionally to 1/N. Doubling N reduces the standard deviation by a factor of \(\sqrt{2}\).

Central Limit Theorem

\(\sqrt{N}\,(\hat{\mu}_N - \mu) \xrightarrow{d} N(0, \sigma^2)\)

Explanation: For large N, the scaled error is approximately normal, enabling confidence intervals using an estimate of \(σ\).

Sample Variance

\(s^2 = \frac{1}{N-1}\sum_{i=1}^{N}\bigl(f(X_i) - \hat{\mu}_N\bigr)^2\)

Explanation: Empirical variance with N−1 in the denominator to remove small-sample bias. Use \(s/\sqrt{N}\) as the standard error estimate.

Importance Sampling Estimator

\(\hat{\mu}_{IS} = \frac{1}{N}\sum_{i=1}^{N} f(Y_i)\,\frac{p(Y_i)}{q(Y_i)}, \qquad Y_i \sim Q\)

Explanation: Draw from a proposal distribution Q but reweight by the likelihood ratio p/q to remain unbiased with potentially lower variance.

Weight in Log-Scale

\(w(y) = \frac{p(y)}{q(y)} = \exp\bigl(\log p(y) - \log q(y)\bigr)\)

Explanation: Computing weights via logs avoids underflow/overflow for extreme values. This is essential in rare-event estimation.

Bias-Variance Decomposition

\(\mathrm{MSE}(\hat{\mu}) = \mathrm{Var}(\hat{\mu}) + \mathrm{Bias}(\hat{\mu})^2\)

Explanation: Overall error splits into variance and squared bias. Plain Monte Carlo is unbiased, so MSE equals variance.

Chebyshev's Inequality

\(P\bigl(\lvert\hat{\mu}_N - \mu\rvert \ge \varepsilon\bigr) \le \frac{\mathrm{Var}(f(X))}{N\varepsilon^2}\)

Explanation: A nonasymptotic bound on the probability of large deviations. It shows the \(1/N\) variance decay explicitly.

Sample Size for Target Error

\(N \approx \frac{z_{1-\alpha/2}^2\, \sigma^2}{\varepsilon^2}\)

Explanation: To achieve half-width \(ε\) with confidence \(1-α\), choose N using the normal quantile and variance estimate.

Antithetic Variates

\(\hat{\mu}_{\text{anti}} = \frac{2}{N}\sum_{i=1}^{N/2} \frac{f(Z_i) + f(-Z_i)}{2}\)

Explanation: Pairing symmetric samples induces negative correlation, often reducing variance without bias when the distribution is symmetric.

Complexity Analysis

The computational cost of plain Monte Carlo estimation is linear in the number of samples N. Each iteration draws one random sample and evaluates f at that sample, so the time complexity is O(N · C_f), where C_f is the cost of a single function evaluation. If f is O(1) to compute, the overall time is O(N). The space complexity is typically O(1) because you can maintain running sums (and sums of squares) without storing all samples; retaining all samples would increase space to O(N), but this is unnecessary for the mean and variance. Variance-reduction techniques preserve the same big-O time while improving the constant factors in accuracy. For example, antithetic variates still require O(N) operations but can reduce the variance by a factor depending on the induced negative correlation. Importance sampling also runs in O(N), though each iteration includes evaluating a weight w(x) = p(x)/q(x); the effective per-sample cost may rise slightly due to extra density computations. The key performance metric is error versus compute: the standard Monte Carlo error decays as \(O(1/\sqrt{N})\). Hence, to reduce error by a factor of 10, you need 100 times more samples if you do not use variance reduction. In parallel environments, Monte Carlo scales almost linearly because samples are independent: with P processors, the ideal time is roughly O(N/P). Practical bottlenecks include PRNG throughput, synchronization (e.g., reductions), and numerical stability (e.g., catastrophic cancellation when summing many values), which can be mitigated with techniques like Kahan summation. Importance sampling can suffer from heavy-tailed weights, effectively reducing the usable sample size; monitoring the effective sample size (ESS) is recommended.

Code Examples

Plain Monte Carlo: Estimate ∫_0^1 e^{-x^2} dx with confidence interval
#include <bits/stdc++.h>
using namespace std;

// Estimate I = E[f(U)] with U ~ Uniform(0,1), f(x) = exp(-x^2).
// Also compute a 95% confidence interval using the sample variance.

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    const size_t N = 1'000'000; // number of samples

    // Seed PRNG with the high-resolution clock for variability
    std::mt19937_64 rng(static_cast<uint64_t>(
        chrono::high_resolution_clock::now().time_since_epoch().count()));
    std::uniform_real_distribution<double> unif(0.0, 1.0);

    auto f = [](double x) {
        return exp(-x * x);
    };

    // Online mean and variance via Welford's algorithm for numerical stability
    long double mean = 0.0L;
    long double M2 = 0.0L; // sum of squares of differences from the current mean

    for (size_t i = 1; i <= N; ++i) {
        double x = unif(rng);
        double fx = f(x);
        long double delta = fx - mean;
        mean += delta / (long double)i;
        long double delta2 = fx - mean;
        M2 += delta * delta2; // accumulates squared deviations
    }

    long double sample_mean = mean;                      // \hat{\mu}
    long double sample_var = M2 / (long double)(N - 1);  // unbiased s^2
    long double se = sqrtl(sample_var / (long double)N); // standard error

    // 95% normal-approximation CI with z = 1.96
    const long double z = 1.96L;
    long double lo = sample_mean - z * se;
    long double hi = sample_mean + z * se;

    cout.setf(std::ios::fixed); cout << setprecision(8);
    cout << "Estimate of ∫_0^1 e^{-x^2} dx: " << (double)sample_mean << "\n";
    cout << "Standard error: " << (double)se << "\n";
    cout << "95% CI: [" << (double)lo << ", " << (double)hi << "]\n";

    // For reference, the true value is (sqrt(pi)/2) * erf(1) ≈ 0.746824
    return 0;
}

We sample U ~ Uniform(0,1) and average f(U)=exp(-U^2), which equals the integral over [0,1]. Welford’s algorithm provides a numerically stable running variance, from which we compute a 95% confidence interval using the CLT.

Time: O(N) · Space: O(1)
Antithetic variates: Estimate E[e^Z] with Z ~ N(0,1)
#include <bits/stdc++.h>
using namespace std;

// Compare plain Monte Carlo vs. antithetic variates for f(z)=exp(z), Z~N(0,1).
// True E[exp(Z)] = exp(1/2) ≈ 1.6487212707.

struct Stats { long double mean; long double var; };

// Compute mean and unbiased variance from a vector
// (for a fair comparison of per-evaluation variability)
Stats summarize(const vector<long double>& v) {
    size_t n = v.size();
    long double mean = 0.0L, M2 = 0.0L;
    for (size_t i = 0; i < n; ++i) {
        long double x = v[i];
        long double d = x - mean;
        mean += d / (long double)(i + 1);
        long double d2 = x - mean;
        M2 += d * d2;
    }
    long double var = (n > 1) ? M2 / (long double)(n - 1) : 0.0L;
    return {mean, var};
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    const size_t N = 1'000'000; // total function-evaluation budget

    std::mt19937_64 rng(123456789ULL); // fixed seed for reproducibility
    std::normal_distribution<double> normal(0.0, 1.0);

    auto f = [](double z) { return exp(z); };

    // Plain MC: use N evaluations
    vector<long double> vals_plain; vals_plain.reserve(N);
    for (size_t i = 0; i < N; ++i) {
        double z = normal(rng);
        vals_plain.push_back(f(z));
    }
    Stats S_plain = summarize(vals_plain);
    long double mean_plain = S_plain.mean;
    long double se_plain = sqrtl(S_plain.var / (long double)N);

    // Antithetic: use pairs (Z, -Z); each pair contributes (f(Z)+f(-Z))/2.
    // Keep the overall function-evaluation budget the same by using N/2 pairs.
    size_t pairs = N / 2;
    vector<long double> pair_means; pair_means.reserve(pairs);
    for (size_t i = 0; i < pairs; ++i) {
        double z = normal(rng);
        long double g = 0.5L * ((long double)f(z) + (long double)f(-z));
        pair_means.push_back(g);
    }
    Stats S_anti = summarize(pair_means);
    long double mean_anti = S_anti.mean; // estimator using N evaluations total
    long double se_anti = sqrtl(S_anti.var / (long double)pairs);

    cout.setf(std::ios::fixed); cout << setprecision(8);
    cout << "True E[e^Z] = exp(1/2) ≈ 1.64872127\n";
    cout << "Plain MC:   mean = " << (double)mean_plain << ", SE ≈ " << (double)se_plain << "\n";
    cout << "Antithetic: mean = " << (double)mean_anti << ", SE ≈ " << (double)se_anti << "\n";
    cout << "SE reduction factor (plain/anti): " << (double)(se_plain / se_anti) << "\n";

    return 0;
}

We estimate E[exp(Z)] using plain Monte Carlo and with antithetic variates by pairing Z with −Z from the symmetric normal distribution. Averaging f(Z) and f(−Z) induces negative correlation, reducing variance without changing the expectation. We report standard errors under an equal total function-evaluation budget.

Time: O(N) · Space: O(N) for storing samples (can be reduced to O(1) with online updates)
Importance sampling: Estimate P(Z > 5) with Z ~ N(0,1)
#include <bits/stdc++.h>
using namespace std;

// Estimate a rare-event probability p = P(Z > 5) where Z ~ N(0,1).
// Naive Monte Carlo would require ~1/p samples; here we use importance
// sampling with Q = N(mu,1), mu = 5.
// Weight: w(y) = p(y)/q(y) = exp(-mu*y + 0.5*mu*mu), computed in the
// log-domain for stability.

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    const size_t N = 500'000; // number of importance samples
    const double mu = 5.0;    // shift toward the rare-event region

    std::mt19937_64 rng(2024ULL);
    std::normal_distribution<double> q_dist(mu, 1.0);  // proposal Q = N(mu,1)
    std::normal_distribution<double> p_dist(0.0, 1.0); // for naive reference

    // Importance sampling estimator
    long double mean_is = 0.0L, M2_is = 0.0L; // online stats for the weighted indicator

    for (size_t i = 1; i <= N; ++i) {
        double y = q_dist(rng);
        // log w = -mu * y + 0.5 * mu^2 (derived from the Gaussian density ratio)
        long double logw = -mu * (long double)y + 0.5L * mu * mu;
        long double w = expl(logw);
        long double h = (y > 5.0) ? w : 0.0L; // indicator * weight

        long double delta = h - mean_is;
        mean_is += delta / (long double)i;
        long double delta2 = h - mean_is;
        M2_is += delta * delta2;
    }
    long double var_is = M2_is / (long double)(N - 1);
    long double se_is = sqrtl(var_is / (long double)N);

    // Naive MC for comparison (same N). Likely returns zero due to rarity.
    size_t count_naive = 0;
    for (size_t i = 0; i < N; ++i) {
        double z = p_dist(rng);
        if (z > 5.0) ++count_naive;
    }
    long double est_naive = (long double)count_naive / (long double)N;

    cout.setf(std::ios::fixed); cout << setprecision(12);
    cout << "Importance sampling estimate P(Z>5): " << (double)mean_is << "\n";
    cout << "Standard error (IS): " << (double)se_is << "\n";
    cout << "Naive MC estimate (same N): " << (double)est_naive << "\n";
    cout << "Reference true value ≈ 2.867e-7 (for N(0,1))\n";

    return 0;
}

The rare event Z>5 has probability ≈ 2.87e−7, so naive Monte Carlo with moderate N will almost always see zero hits. Importance sampling draws from N(5,1), where such samples are common, and corrects with the weight w = p/q. We maintain online mean and variance to report a standard error. We compute weights in log-space for numerical stability.

Time: O(N) · Space: O(1)
#monte carlo#expectation#variance reduction#importance sampling#antithetic variates#confidence interval#central limit theorem#law of large numbers#pseudorandom#standard error#effective sample size#rare event#integration#high-dimensional#sample mean