Math · Intermediate

Confidence Intervals & Prediction Intervals

Key Points

  • A confidence interval estimates a fixed but unknown parameter (like a population mean) with a range that would capture the true value in a long run of repeated samples.
  • A prediction interval estimates the range in which a single future observation will fall; it is wider because it includes both parameter uncertainty and natural randomness.
  • For means with unknown variance and normal data, confidence intervals use the t-distribution with n−1 degrees of freedom.
  • Prediction intervals for a single future observation add an extra 1 under the square root to account for observation noise.
  • In linear regression, you can compute a confidence interval for the mean response at x₀ and a wider prediction interval for a new y at x₀.
  • Frequentist confidence is about long-run coverage, not the probability that a specific computed interval contains the parameter.
  • Always check assumptions: independence, approximate normality (or large n for the CLT), and correct model specification.
  • C++ implementations require numerical routines for quantiles; an accurate inverse normal CDF plus a Cornish–Fisher expansion approximates t-quantiles well.

Prerequisites

  • Basic Probability and Distributions — Understanding normal and t-distributions, expectations, and variance is essential for constructing intervals.
  • Descriptive Statistics — You must compute means, variances, and standard deviations to plug into interval formulas.
  • Central Limit Theorem — Explains why normal approximations for the mean are valid as sample sizes grow.
  • Simple Linear Regression — Needed for understanding intervals for the mean response and prediction in a regression context.
  • Numerical Methods for Quantiles — Implementing critical values in C++ requires inverse CDF approximations or numerical solvers.

Detailed Explanation


01 Overview

Imagine you measure the heights of 30 plants to learn about the average height in the whole greenhouse. You know your sample average, but it is only an estimate. A confidence interval (CI) wraps that estimate with a margin of error to indicate a plausible range for the true average height. If you repeated the experiment many times, most of those computed intervals would cover the true parameter at the advertised rate (for example, 95%).

Now suppose you want to forecast the height of the next plant you will measure. A prediction interval (PI) answers that different question by providing a range in which a single future observation is likely to fall; it is wider because it must also reflect plant-to-plant variability. Conceptually, confidence intervals quantify uncertainty about a fixed but unknown parameter (a mean, a difference of means, a slope), while prediction intervals quantify uncertainty about new data points.

Mathematically, these intervals are built from sampling distributions. For means, we use the normal or t-distribution depending on whether the population variance is known. In regression, we use the fitted model, the residual variance, and the leverage at the prediction point. In practice, you build an interval by taking an estimate ± a critical value times a standard error; the key is to choose the right standard error and distribution so that the interval has the desired long-run coverage.

02 Intuition & Analogies

Think of throwing darts at a hidden bullseye. The bullseye is the true parameter (the real average height in the greenhouse), and your dart throws are sample estimates that jiggle around due to random sampling. A confidence interval is like drawing a circle around where your dart landed, sized so that, if you kept throwing darts and redrawing circles, a fixed percentage of those circles would cover the bullseye. Crucially, the bullseye does not move; your circles do. That is why we say a 95% confidence interval has 95% coverage in repeated sampling, not that the parameter is 95% probable to be in the particular circle you drew.

A prediction interval is more like forecasting where the next dart will land. Even if you knew the bullseye's location exactly, the next dart would still scatter due to randomness. So for prediction we combine two uncertainties: (1) we don't know the bullseye's location precisely (estimation uncertainty), and (2) throws are inherently variable (process noise). This is why prediction intervals are wider than confidence intervals based on the same data.

In linear regression, picture a best-fit line through a cloud of points. A CI for the mean response at some x₀ is a narrow band around the line, reflecting uncertainty about the line's position. A PI for a new point at x₀ is a wider band, reflecting both uncertainty about the line and the fact that new points deviate from it due to residual scatter. Farther from the center of your observed x-values, leverage increases and both bands widen, just as you'd expect when extrapolating.

03 Formal Definition

Let X₁, …, Xₙ be i.i.d. observations from a distribution parameterized by θ. A 100(1−α)% confidence interval (CI) for θ is a random interval C(X₁:ₙ) such that P_θ(θ ∈ C(X₁:ₙ)) ≥ 1−α for all θ, where the probability is over the sampling distribution of the data. For a normal mean with unknown variance, the classic CI is

$$\bar{X} \pm t_{1-\alpha/2,\,n-1} \cdot \frac{S}{\sqrt{n}},$$

where S is the sample standard deviation and $t_{1-\alpha/2,\,\nu}$ is the (1−α/2) quantile of Student's t with ν degrees of freedom. A 100(1−α)% prediction interval (PI) for a future observation X_new is a random interval P(X₁:ₙ) such that P_θ(X_new ∈ P(X₁:ₙ)) ≥ 1−α. Under normality with unknown variance, a PI for a single future observation is

$$\bar{X} \pm t_{1-\alpha/2,\,n-1} \cdot S\sqrt{1 + \frac{1}{n}}.$$

In simple linear regression Y = β₀ + β₁x + ε with ε ∼ N(0, σ²), the CI for the mean response at x₀ is

$$\hat{y}(x_0) \pm t_{1-\alpha/2,\,n-2} \cdot \hat{\sigma}\sqrt{\frac{1}{n} + \frac{(x_0-\bar{x})^2}{\sum_i (x_i-\bar{x})^2}},$$

while the PI for a new response at x₀ is

$$\hat{y}(x_0) \pm t_{1-\alpha/2,\,n-2} \cdot \hat{\sigma}\sqrt{1 + \frac{1}{n} + \frac{(x_0-\bar{x})^2}{\sum_i (x_i-\bar{x})^2}}.$$

These intervals achieve nominal coverage asymptotically under mild conditions (e.g., by the Central Limit Theorem) or exactly under normality. The essential building blocks are an estimator, its standard error, and a pivotal quantity whose distribution does not depend on unknown parameters.
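The extra 1 under the square root in the PI comes from a one-line variance calculation: the future observation X_new is independent of the sample, so

$$\operatorname{Var}(X_{\text{new}} - \bar{X}) = \operatorname{Var}(X_{\text{new}}) + \operatorname{Var}(\bar{X}) = \sigma^2 + \frac{\sigma^2}{n} = \sigma^2\left(1 + \frac{1}{n}\right),$$

and under normality $(X_{\text{new}} - \bar{X}) / \bigl(S\sqrt{1 + 1/n}\bigr)$ follows a t distribution with n−1 degrees of freedom, which yields the PI stated above.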

04 When to Use

Use a confidence interval when your goal is to quantify uncertainty about a fixed but unknown parameter: a population mean, a difference of means, a regression slope, or a proportion. For example, after running an A/B test, you may build a CI for the difference in conversion rates to decide whether the effect is practically significant. In scientific measurement, CIs report uncertainty around estimates of physical constants or treatment effects.

Use a prediction interval when your goal is to forecast a single future observation or a small number of future values: the next temperature reading, tomorrow's sales, or the next latency measurement in a system. In regression, select a CI if you care about the expected value E[Y | X = x₀] (e.g., average fuel efficiency at a speed), and a PI if you care about an actual new Y at x₀ (e.g., the next car's efficiency), which varies more around the mean.

If sample sizes are small and the population variance is unknown but data are approximately normal, prefer t-based intervals. For large samples, normal approximations generally suffice (by the CLT). In linear models, ensure the model is appropriate (linearity, homoscedasticity, independent errors) before trusting the intervals. When making multiple intervals, consider adjustments (e.g., Bonferroni) to control the family-wise error rate.

⚠️ Common Mistakes

Misinterpreting confidence: saying "there is a 95% probability the true mean lies in this computed interval" is incorrect in the frequentist framework; the correct statement is about long-run coverage across repeated samples. Another mistake is using a CI when you actually need a PI: forecasts for individual outcomes will be too narrow if you report a CI for the mean.

Ignoring assumptions is common. t-intervals assume approximate normality of the sampling distribution of the mean; with small n and heavy tails or strong skew, coverage can be poor. In regression, failing to check linearity, constant variance, or independence can lead to misleading intervals, and extrapolating far beyond the observed x-range inflates error due to high leverage. Using the wrong standard error or degrees of freedom is another trap: with unknown variance use S and df = n−1 (or n−p in regression), not the population σ.

When sample sizes are tiny, blindly applying asymptotic normal intervals is risky; consider exact or nonparametric methods. Finally, reporting just the interval without context (assumptions, sample size, and method) can mislead stakeholders; always accompany intervals with method notes and diagnostics.

Key Formulas

Sample Mean

$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$$

Explanation: The sample mean is the average of observed values and serves as an estimator for the population mean. It is the center point for many confidence and prediction intervals.

Sample Variance

$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2$$

Explanation: This is the unbiased estimator of the population variance. It measures variability of observations around the sample mean.

Z-interval (known variance)

$$\mathrm{CI}_{\mu}:\ \bar{X} \pm z_{1-\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$$

Explanation: When the population variance is known and data are normal (or n is large), the confidence interval for the mean uses the standard normal critical value. The width shrinks as n increases.

t-interval (unknown variance)

$$\mathrm{CI}_{\mu}:\ \bar{X} \pm t_{1-\alpha/2,\,n-1} \cdot \frac{S}{\sqrt{n}}$$

Explanation: When variance is unknown, replace σ by S and use a t critical value with n−1 degrees of freedom. This yields exact coverage under normality.

One-sample prediction interval

$$\mathrm{PI}_{\text{one future}}:\ \bar{X} \pm t_{1-\alpha/2,\,n-1} \cdot S\sqrt{1 + \frac{1}{n}}$$

Explanation: A future observation deviates from the mean due to both estimation error and inherent noise. The extra 1 inside the square root accounts for observation noise.

OLS estimates (simple regression)

$$\hat{\beta}_1 = \frac{\sum (x_i-\bar{x})(y_i-\bar{y})}{\sum (x_i-\bar{x})^2}, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$

Explanation: These formulas compute the slope and intercept of the best-fit line in least squares. They are used to form predictions and intervals in regression.

Residual variance (simple regression)

$$\hat{\sigma}^2 = \frac{1}{n-2}\sum_{i=1}^{n}\left(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\right)^2$$

Explanation: This estimates the error variance in simple linear regression using n−2 degrees of freedom. It drives the width of regression intervals.

Regression CI for mean response

$$\mathrm{CI}_{E[Y \mid x_0]}:\ \hat{y}_0 \pm t_{1-\alpha/2,\,n-2} \cdot \hat{\sigma}\sqrt{\frac{1}{n} + \frac{(x_0-\bar{x})^2}{\sum (x_i-\bar{x})^2}}$$

Explanation: This interval quantifies uncertainty about the expected response at x0. It narrows with more data and widens farther from the center of x.

Regression prediction interval

$$\mathrm{PI}_{Y_{\text{new}} \mid x_0}:\ \hat{y}_0 \pm t_{1-\alpha/2,\,n-2} \cdot \hat{\sigma}\sqrt{1 + \frac{1}{n} + \frac{(x_0-\bar{x})^2}{\sum (x_i-\bar{x})^2}}$$

Explanation: This covers a single new observation at x0 and is wider than the CI because it includes the observation noise term (the leading 1 under the root).

Coverage definition

$$P_{\theta}\left(\theta \in C(X_{1:n})\right) = 1 - \alpha$$

Explanation: By design, a 100(1−α) confidence procedure satisfies this coverage probability across repeated sampling. It is a property of the method, not of a specific realized interval.

Complexity Analysis

Computing confidence and prediction intervals in the one-sample mean setting requires simple summary statistics: the sample mean and sample variance. These can be obtained in a single pass over n data points, yielding O(n) time and O(1) auxiliary space. The critical-value lookup (z or t quantile) is effectively O(1), assuming a constant-time approximation or a small, bounded number of iterations in a numerical solver, so the overall cost is dominated by scanning the data.

In simple linear regression, computing the slope, intercept, and residual variance requires basic sums: Σxᵢ, Σyᵢ, Σxᵢ², Σxᵢyᵢ, and Σyᵢ². Each sum aggregates in a single pass, so fitting the model and then forming intervals for any fixed x₀ is O(n) time with O(1) extra space. Once the model is fit, computing an interval at each new x₀ is O(1), making batch predictions efficient after the initial O(n) fit; if you need intervals at many x₀ values, the amortized cost per interval remains O(1) as long as you reuse the precomputed statistics (x̄, Sxx, and σ̂).

From a numerical standpoint, the most delicate operation is evaluating distribution quantiles. A high-quality inverse normal CDF (e.g., Acklam's rational approximation) runs in constant time with near machine-precision accuracy, and t-quantiles can be approximated from normal quantiles via a Cornish–Fisher expansion whose error diminishes as the degrees of freedom grow; this also runs in constant time per query. For typical statistical workloads in C++, both CI and PI computations are therefore linear in the number of data points, with negligible constant overhead for quantile evaluation and arithmetic.

Code Examples

One-sample t Confidence Interval and One-step-ahead Prediction Interval
#include <iostream>
#include <vector>
#include <cmath>
#include <stdexcept>
#include <numeric>

// Inverse standard normal CDF (Acklam's approximation).
// Returns z such that Phi(z) = p for 0 < p < 1.
static double inv_norm_cdf(double p) {
    if (p <= 0.0 || p >= 1.0) throw std::invalid_argument("p must be in (0,1)");
    // Coefficients for Acklam's approximation
    const double a1 = -3.969683028665376e+01;
    const double a2 =  2.209460984245205e+02;
    const double a3 = -2.759285104469687e+02;
    const double a4 =  1.383577518672690e+02;
    const double a5 = -3.066479806614716e+01;
    const double a6 =  2.506628277459239e+00;

    const double b1 = -5.447609879822406e+01;
    const double b2 =  1.615858368580409e+02;
    const double b3 = -1.556989798598866e+02;
    const double b4 =  6.680131188771972e+01;
    const double b5 = -1.328068155288572e+01;

    const double c1 = -7.784894002430293e-03;
    const double c2 = -3.223964580411365e-01;
    const double c3 = -2.400758277161838e+00;
    const double c4 = -2.549732539343734e+00;
    const double c5 =  4.374664141464968e+00;
    const double c6 =  2.938163982698783e+00;

    const double d1 = 7.784695709041462e-03;
    const double d2 = 3.224671290700398e-01;
    const double d3 = 2.445134137142996e+00;
    const double d4 = 3.754408661907416e+00;

    const double plow  = 0.02425;
    const double phigh = 1.0 - plow;

    double q, r;
    if (p < plow) {
        // Rational approximation for the lower region
        q = std::sqrt(-2.0 * std::log(p));
        return (((((c1*q + c2)*q + c3)*q + c4)*q + c5)*q + c6) /
               ((((d1*q + d2)*q + d3)*q + d4)*q + 1.0);
    } else if (phigh < p) {
        // Rational approximation for the upper region
        q = std::sqrt(-2.0 * std::log(1.0 - p));
        return -(((((c1*q + c2)*q + c3)*q + c4)*q + c5)*q + c6) /
                ((((d1*q + d2)*q + d3)*q + d4)*q + 1.0);
    } else {
        // Rational approximation for the central region
        q = p - 0.5;
        r = q * q;
        return (((((a1*r + a2)*r + a3)*r + a4)*r + a5)*r + a6) * q /
               (((((b1*r + b2)*r + b3)*r + b4)*r + b5)*r + 1.0);
    }
}

// Approximate t-quantile using a Cornish-Fisher expansion around the normal.
// Returns t such that F_t(t; df) ~= p.
static double t_quantile(double p, double df) {
    if (df <= 1.5) throw std::invalid_argument("degrees of freedom must be > 1.5");
    double z  = inv_norm_cdf(p);
    double z3 = z*z*z;
    double z5 = z3*z*z;
    double z7 = z5*z*z;
    double a = (z3 + z) / (4.0*df);
    double b = (5.0*z5 + 16.0*z3 + 3.0*z) / (96.0*df*df);
    double c = (3.0*z7 + 19.0*z5 + 17.0*z3 - 15.0*z) / (384.0*df*df*df); // optional higher-order term
    return z + a + b + c; // accurate for df >= ~5; still reasonable for smaller df
}

struct OneSampleIntervals {
    double mean;
    double s;
    double ci_low;
    double ci_high;
    double pi_low;
    double pi_high;
};

OneSampleIntervals compute_one_sample_intervals(const std::vector<double>& x, double alpha = 0.05) {
    size_t n = x.size();
    if (n < 2) throw std::invalid_argument("Need at least 2 observations");

    // Sample mean
    double sum  = std::accumulate(x.begin(), x.end(), 0.0);
    double mean = sum / static_cast<double>(n);

    // Unbiased sample variance
    double s2 = 0.0;
    for (double xi : x) {
        double d = xi - mean;
        s2 += d * d;
    }
    s2 /= static_cast<double>(n - 1);
    double s = std::sqrt(s2);

    // t critical value for a two-sided (1 - alpha) interval
    double tcrit = t_quantile(1.0 - alpha/2.0, static_cast<double>(n - 1));

    // Confidence interval for the mean
    double se_mean = s / std::sqrt(static_cast<double>(n));
    double ci_low  = mean - tcrit * se_mean;
    double ci_high = mean + tcrit * se_mean;

    // Prediction interval for one future observation
    double se_pred = s * std::sqrt(1.0 + 1.0/static_cast<double>(n));
    double pi_low  = mean - tcrit * se_pred;
    double pi_high = mean + tcrit * se_pred;

    return {mean, s, ci_low, ci_high, pi_low, pi_high};
}

int main() {
    // Example data: measured response times (ms)
    std::vector<double> data = {102, 98, 105, 110, 95, 101, 99, 107, 103, 100};
    double alpha = 0.05; // 95% intervals

    try {
        auto res = compute_one_sample_intervals(data, alpha);
        std::cout << "n = " << data.size() << "\n";
        std::cout << "Sample mean = " << res.mean << ", sample s = " << res.s << "\n";
        std::cout << "95% CI for mean: [" << res.ci_low << ", " << res.ci_high << "]\n";
        std::cout << "95% PI for next observation: [" << res.pi_low << ", " << res.pi_high << "]\n";
    } catch (const std::exception& e) {
        std::cerr << "Error: " << e.what() << "\n";
        return 1;
    }
    return 0;
}

This program computes a one-sample t confidence interval for the population mean and a prediction interval for a single future observation under normality with unknown variance. It first calculates the sample mean and unbiased sample variance, then uses an accurate inverse normal CDF and a Cornish–Fisher expansion to approximate the t critical value. The CI uses S/√n as the standard error, while the PI uses S·√(1 + 1/n), making the prediction interval wider to reflect observation noise.

Time: O(n) · Space: O(1)
Simple Linear Regression: CI for Mean Response and PI for New Observation at x0
#include <iostream>
#include <vector>
#include <cmath>
#include <stdexcept>
#include <numeric>

// Inverse standard normal CDF (Acklam's approximation); same routine as above.
static double inv_norm_cdf(double p) {
    if (p <= 0.0 || p >= 1.0) throw std::invalid_argument("p must be in (0,1)");
    const double a1 = -3.969683028665376e+01;
    const double a2 =  2.209460984245205e+02;
    const double a3 = -2.759285104469687e+02;
    const double a4 =  1.383577518672690e+02;
    const double a5 = -3.066479806614716e+01;
    const double a6 =  2.506628277459239e+00;

    const double b1 = -5.447609879822406e+01;
    const double b2 =  1.615858368580409e+02;
    const double b3 = -1.556989798598866e+02;
    const double b4 =  6.680131188771972e+01;
    const double b5 = -1.328068155288572e+01;

    const double c1 = -7.784894002430293e-03;
    const double c2 = -3.223964580411365e-01;
    const double c3 = -2.400758277161838e+00;
    const double c4 = -2.549732539343734e+00;
    const double c5 =  4.374664141464968e+00;
    const double c6 =  2.938163982698783e+00;

    const double d1 = 7.784695709041462e-03;
    const double d2 = 3.224671290700398e-01;
    const double d3 = 2.445134137142996e+00;
    const double d4 = 3.754408661907416e+00;

    const double plow  = 0.02425;
    const double phigh = 1.0 - plow;

    double q, r;
    if (p < plow) {
        q = std::sqrt(-2.0 * std::log(p));
        return (((((c1*q + c2)*q + c3)*q + c4)*q + c5)*q + c6) /
               ((((d1*q + d2)*q + d3)*q + d4)*q + 1.0);
    } else if (phigh < p) {
        q = std::sqrt(-2.0 * std::log(1.0 - p));
        return -(((((c1*q + c2)*q + c3)*q + c4)*q + c5)*q + c6) /
                ((((d1*q + d2)*q + d3)*q + d4)*q + 1.0);
    } else {
        q = p - 0.5;
        r = q * q;
        return (((((a1*r + a2)*r + a3)*r + a4)*r + a5)*r + a6) * q /
               (((((b1*r + b2)*r + b3)*r + b4)*r + b5)*r + 1.0);
    }
}

// Approximate t-quantile via Cornish-Fisher
static double t_quantile(double p, double df) {
    if (df <= 1.5) throw std::invalid_argument("degrees of freedom must be > 1.5");
    double z  = inv_norm_cdf(p);
    double z3 = z*z*z;
    double z5 = z3*z*z;
    double z7 = z5*z*z;
    double a = (z3 + z) / (4.0*df);
    double b = (5.0*z5 + 16.0*z3 + 3.0*z) / (96.0*df*df);
    double c = (3.0*z7 + 19.0*z5 + 17.0*z3 - 15.0*z) / (384.0*df*df*df);
    return z + a + b + c;
}

struct RegressionIntervals {
    double beta0;
    double beta1;
    double sigma;
    double mean_ci_low;
    double mean_ci_high;
    double pred_pi_low;
    double pred_pi_high;
};

RegressionIntervals compute_regression_intervals(const std::vector<double>& x,
                                                 const std::vector<double>& y,
                                                 double x0,
                                                 double alpha = 0.05) {
    size_t n = x.size();
    if (n != y.size() || n < 3) throw std::invalid_argument("Need n >= 3 and matching x,y sizes");

    double sx = std::accumulate(x.begin(), x.end(), 0.0);
    double sy = std::accumulate(y.begin(), y.end(), 0.0);
    double xbar = sx / n;
    double ybar = sy / n;

    // Centered sums for the slope
    double Sxx = 0.0, Sxy = 0.0;
    for (size_t i = 0; i < n; ++i) {
        Sxx += (x[i] - xbar) * (x[i] - xbar);
        Sxy += (x[i] - xbar) * (y[i] - ybar);
    }
    if (Sxx == 0.0) throw std::runtime_error("All x values are identical; cannot fit slope");

    double beta1 = Sxy / Sxx;
    double beta0 = ybar - beta1 * xbar;

    // Residual variance estimate (df = n - 2)
    double rss = 0.0;
    for (size_t i = 0; i < n; ++i) {
        double e = y[i] - (beta0 + beta1 * x[i]);
        rss += e * e;
    }
    double sigma = std::sqrt(rss / static_cast<double>(n - 2));

    // Prediction at x0
    double yhat0 = beta0 + beta1 * x0;

    // t critical value with df = n - 2
    double tcrit = t_quantile(1.0 - alpha/2.0, static_cast<double>(n - 2));

    // Standard errors for the mean response and for a new observation
    double se_mean = sigma * std::sqrt( (1.0/n) + ((x0 - xbar)*(x0 - xbar))/Sxx );
    double se_pred = sigma * std::sqrt( 1.0 + (1.0/n) + ((x0 - xbar)*(x0 - xbar))/Sxx );

    RegressionIntervals out;
    out.beta0 = beta0;
    out.beta1 = beta1;
    out.sigma = sigma;
    out.mean_ci_low  = yhat0 - tcrit * se_mean;
    out.mean_ci_high = yhat0 + tcrit * se_mean;
    out.pred_pi_low  = yhat0 - tcrit * se_pred;
    out.pred_pi_high = yhat0 + tcrit * se_pred;
    return out;
}

int main() {
    // Example data: x = engine size (liters), y = fuel consumption (L/100km)
    std::vector<double> x = {1.2, 1.6, 2.0, 2.4, 3.0, 3.2, 3.6};
    std::vector<double> y = {5.8, 6.2, 7.0, 7.5, 8.5, 9.0, 9.4};

    double x0 = 2.5;     // Predict at this engine size
    double alpha = 0.05; // 95% intervals

    try {
        auto res = compute_regression_intervals(x, y, x0, alpha);
        std::cout << "Fit: y = " << res.beta0 << " + " << res.beta1 << " x\n";
        std::cout << "Residual sigma = " << res.sigma << "\n";
        std::cout << "95% CI for mean at x0: [" << res.mean_ci_low << ", " << res.mean_ci_high << "]\n";
        std::cout << "95% PI for new y at x0: [" << res.pred_pi_low << ", " << res.pred_pi_high << "]\n";
    } catch (const std::exception& e) {
        std::cerr << "Error: " << e.what() << "\n";
        return 1;
    }
    return 0;
}

This program fits a simple linear regression by computing slope and intercept from summary sums, then estimates the residual standard deviation. It uses a t critical value (via a Cornish–Fisher approximation) to construct a confidence interval for the mean response at x0 and a wider prediction interval for a new observation at x0. Farther from the average x, both intervals widen due to the leverage term (x0−x̄)^2/Sxx.

Time: O(n) · Space: O(1)
Tags: confidence interval, prediction interval, t distribution, standard error, linear regression, coverage probability, critical value, Cornish–Fisher, inverse normal CDF, central limit theorem, leverage, frequentist inference, residual variance, mean estimation, forecasting