Taylor Series & Approximation
Key Points
- Taylor series approximate a complicated function near a point by a simple polynomial built from its derivatives.
- The first few terms give the familiar linear (tangent) and quadratic (curvature) approximations used in physics and engineering.
- The approximation error is controlled by a remainder term that typically shrinks like a high power of the step h when h is small.
- Maclaurin series are Taylor series centered at 0 and include classic expansions for e^x, sin x, and cos x.
- In code, we evaluate Taylor polynomials efficiently with Horner's method and incremental terms to avoid expensive pow and factorial calls.
- Taylor ideas power many numerical methods such as finite differences, numerical differentiation, and fast approximations of transcendental functions.
- Convergence depends on analyticity and the distance to the nearest singularity; far from the center the series can fail or converge very slowly.
- Practical implementations must balance truncation error (too few terms) against round-off error (too many terms) in floating-point arithmetic.
Prerequisites
- Limits and continuity — Taylor series rely on behavior as h → 0 and require continuity for remainder bounds.
- Derivatives and higher-order derivatives — Coefficients of the Taylor polynomial are built from derivatives at the center.
- Factorials and binomial coefficients — Taylor coefficients use k! and generalized binomials for series like (1+x)^\alpha.
- Big-O notation — To interpret remainder behavior as O(h^{n+1}) and compare error terms.
- Power series and convergence — Understanding when infinite Taylor series converge to the function.
- Floating-point arithmetic — Implementation accuracy depends on round-off, cancellation, and overflow/underflow risks.
- Basic C++ programming — To implement loops, functions, and numerical computations safely and efficiently.
Detailed Explanation
01 Overview
A Taylor series is a way to locally replace a complicated function with a polynomial that matches its value, slope, curvature, and higher derivatives at a chosen center point. Think of taking a curve and zooming in: very close to a point, the curve looks almost like a straight line (the tangent). If you zoom a little less, the curve’s bend (curvature) matters, so a quadratic fits better. Continue this process, and you build a polynomial whose coefficients are the function’s derivatives at the center divided by factorials. This idea underlies countless techniques in science and engineering because polynomials are easy to compute and reason about. In practice, we rarely use infinitely many terms; instead, we truncate after n terms to get a Taylor polynomial, accepting a small remainder error that typically scales as a high power of the distance from the center. For well-behaved (analytic) functions, the full infinite Taylor series converges to the function within some radius around the center (the radius of convergence). Computationally, Taylor expansions let us approximate values of expensive functions (e.g., e^x, sin x) and derive accurate finite-difference formulas for derivatives. The key to safe use is understanding how many terms to keep, where the series converges, and how to bound the remainder.
02 Intuition & Analogies
Imagine you’re describing a hill to a friend standing at a particular spot. First, you say the current elevation (function value). Next, you describe how steep it is and in what direction (slope). If you want to be more accurate, you add how the slope is changing as you move (curvature). Each of these pieces lets your friend reconstruct a better local picture of the hill around that spot. A Taylor series does exactly this: it encodes the local behavior of a function around a center by stacking value, slope, curvature, and higher “bends” (higher derivatives). The distance you move from the center is like the step h. For a tiny h, just the first term or two may be enough; for a larger h, you need more detail (more terms). In everyday life, we already use the first term: when we say “for small changes, the output changes roughly slope × change,” we’re using the linear (first-order) Taylor approximation. The second-order term adds curvature, much like how a car’s path feels different on a straight road (no curvature) vs. a bend (curvature changes direction). As you add higher orders, you refine the local picture. But this zoom metaphor has limits: if you try to describe a distant cliff using only near-you information, your story breaks down. That’s the radius of convergence: stray too far from where you measured, and the description stops matching reality. In computation, this intuition tells us to center expansions near where we evaluate and to stop adding terms once the new details are smaller than numerical noise.
03 Formal Definition
Let f be infinitely differentiable at a point a. The Taylor series of f about a is f(x) = \sum_{k=0}^{\infty} \frac{f^{(k)}(a)}{k!} (x-a)^k. Truncating after the degree-n term gives the Taylor polynomial T_n(x), and f(x) = T_n(x) + R_n(x), where R_n is the remainder. If R_n(x) \to 0 as n \to \infty, the series converges to f(x); this happens for |x-a| < R, the radius of convergence. The Key Formulas section lists T_n, the Lagrange form of R_n, and R explicitly.
04 When to Use
Use Taylor approximations when you need fast, local estimates of a function and you know (or can compute) derivatives at a nearby point. Typical scenarios: (1) Fast function evaluation: approximate e^x, sin x, cos x near a center to avoid costly library calls in tight loops (often with range reduction to keep |x-a| small). (2) Error analysis and algorithm design: derive finite-difference formulas like f'(x) \approx (f(x+h)-f(x-h))/(2h) via Taylor expansions to understand accuracy. (3) Physics and engineering: small-angle approximations (\sin \theta \approx \theta), perturbation methods, and linearization of nonlinear systems for control and stability analysis. (4) Root finding and optimization: Newton’s method uses a first-order Taylor model to step toward roots or minima. (5) Differential equations: series methods and local truncation error estimates in numerical integrators (e.g., Runge–Kutta order conditions are Taylor-based). Prefer Taylor when the evaluation point is close to the expansion center and the function is smooth (analytic) there. Be cautious or avoid Taylor when the function has singularities or sharp features near your region (small radius of convergence), or when |x-a| is large, which can demand many terms and amplify rounding errors. In such cases, re-center, reduce the argument, or switch methods (rational approximations, Chebyshev polynomials).
⚠️ Common Mistakes
- Ignoring the remainder: Stopping after a few terms without checking an error bound can yield misleading results. Remedy: use Lagrange's bound or an alternating-series bound when applicable, or estimate with the first omitted term if justified.
- Confusing Maclaurin with Taylor: Expanding at 0 (Maclaurin) can be poor if you evaluate far from 0. Remedy: expand around a point near your x, possibly after range reduction.
- Using too many terms blindly: In floating-point, very high-degree polynomials can suffer catastrophic cancellation and overflow in factorials. Remedy: evaluate with Horner's method and incremental terms; stop when added terms no longer change the sum.
- Wrong units for trig: Using degrees in formulas derived for radians (e.g., \sin x \approx x) leads to large errors. Remedy: always use radians in Taylor expansions for trig.
- Exceeding the radius of convergence: Applying a series outside its convergence interval (e.g., the binomial series for |x| \ge 1) can diverge. Remedy: check convergence conditions or change the center.
- Naive pow/factorial computation: Recomputing x^k and k! each step is slow and numerically risky. Remedy: update terms multiplicatively: term_k = term_{k-1} * (x-a)/k.
- Misinterpreting Big-O: O(h^{n+1}) describes asymptotic behavior as h \to 0, not a concrete bound for a fixed h. Remedy: use Lagrange's form for concrete bounds when possible.
Key Formulas
Taylor Series (general)
f(x) = \sum_{k=0}^{\infty} \frac{f^{(k)}(a)}{k!} (x-a)^k
Explanation: This expresses a function as an infinite polynomial around a point a. If the series converges to f, it provides exact values; truncations give approximations.
Taylor Polynomial (degree n)
T_n(x) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!} (x-a)^k
Explanation: This is the degree-n truncation used in practice. It matches the value and the first n derivatives of f at x=a.
Lagrange Remainder
R_n(x) = \frac{f^{(n+1)}(\xi)}{(n+1)!} (x-a)^{n+1}, \quad \text{for some } \xi \text{ between } a \text{ and } x
Explanation: The exact error of the degree-n polynomial equals the (n+1)-st derivative at some intermediate point times (x-a)^{n+1}/(n+1)!. It enables concrete error bounds.
Asymptotic Remainder
R_n(h) = O(h^{n+1}) \quad \text{as } h \to 0
Explanation: As the step size tends to 0, the error shrinks proportionally to h^{n+1}. This summarizes the rate at which the approximation improves.
Maclaurin Series
f(x) = \sum_{k=0}^{\infty} \frac{f^{(k)}(0)}{k!} x^k
Explanation: This is the special case centered at 0. Many classic expansions (e.g., for e^x, sin x) are Maclaurin series.
Exponential Series
e^x = \sum_{k=0}^{\infty} \frac{x^k}{k!}
Explanation: The exponential's series converges for all real x and is often used as a benchmark for series methods.
Sine Series
\sin x = \sum_{k=0}^{\infty} \frac{(-1)^k x^{2k+1}}{(2k+1)!}
Explanation: This alternating series contains only odd powers and converges for all real x. The alternating nature allows tight error bounds from the next term.
Cosine Series
\cos x = \sum_{k=0}^{\infty} \frac{(-1)^k x^{2k}}{(2k)!}
Explanation: This alternating series contains only even powers and converges for all real x. Like sine, its remainder is bounded by the next term under standard conditions.
Central Difference (order 2)
f'(x) = \frac{f(x+h) - f(x-h)}{2h} + O(h^2)
Explanation: Derived from Taylor expansions at x±h, this provides a second-order accurate numerical derivative for small h.
Richardson Extrapolation for Derivatives
f'(x) \approx \frac{4 D(h/2) - D(h)}{3}, \quad \text{where } D(h) = \frac{f(x+h) - f(x-h)}{2h}
Explanation: By combining central differences at h and h/2, the leading O(h^2) error cancels, yielding a fourth-order accurate estimate.
Cauchy–Hadamard Radius
\frac{1}{R} = \limsup_{k \to \infty} |c_k|^{1/k} \quad \text{for the series } \sum_{k} c_k (x-a)^k
Explanation: This gives the radius of convergence R for a power series from the growth rate of its coefficients. Taylor series converge where |x-a| < R.
Alternating Series Error
|S - S_n| \le a_{n+1}, \quad \text{where } a_{n+1} \text{ is the magnitude of the first omitted term}
Explanation: For an alternating series with monotonically decreasing term magnitudes, the error is at most the first omitted term. Useful for sin and cos near 0.
Generalized Binomial Series
(1+x)^{\alpha} = \sum_{k=0}^{\infty} \binom{\alpha}{k} x^k, \quad |x| < 1
Explanation: A classic Taylor series for non-integer exponents, valid for |x|<1. It's widely used for small-perturbation roots and reciprocals.
Horner Form of Polynomial
T_n(a+h) = c_0 + h(c_1 + h(c_2 + \cdots + h\,c_n)), \quad c_k = \frac{f^{(k)}(a)}{k!}
Explanation: Nested evaluation minimizes multiplications and improves numerical stability when computing Taylor polynomials.
Complexity Analysis
Evaluating an n-term Taylor polynomial with incremental term updates or Horner's method costs O(n) multiplications and additions and O(1) extra memory, versus O(n^2) work if x^k and k! are recomputed from scratch for every term. For a fixed tolerance, the number of terms needed grows with |x-a|, which is why range reduction (keeping the evaluation point close to the center) matters in practice.
Code Examples
```cpp
#include <iostream>
#include <cmath>
#include <limits>
#include <iomanip>

struct ExpApprox {
    long double value;       // approximated e^x
    int terms;               // number of terms used (including the initial 1)
    long double error_bound; // conservative bound via Lagrange remainder
};

ExpApprox taylor_exp(long double x, long double eps = 1e-18L, int max_terms = 1000) {
    // Compute e^x using its Maclaurin series with incremental terms:
    // sum_{k=0}^{infinity} x^k / k!
    long double sum = 1.0L;  // k = 0 term
    long double term = 1.0L; // current term value x^k / k!
    int k = 0;
    while (k < max_terms) {
        ++k;
        term *= x / k; // move from k-1 to k: multiply by x/k
        sum += term;
        if (fabsl(term) < eps) break; // stop when term is tiny
    }
    // Lagrange remainder bound: |R_n| <= e^{|x|} * |x|^{n+1}/(n+1)!
    // The next term magnitude is |x|^{n+1}/(n+1)! = |term| * |x|/(n+1)
    long double next_term = fabsl(term) * fabsl(x) / (k + 1);
    long double M = expl(fabsl(x));
    long double error_bound = M * next_term; // conservative (often overestimates)
    return {sum, k + 1, error_bound};
}

int main() {
    std::cout.setf(std::ios::fixed);
    std::cout << std::setprecision(18);

    for (long double x : {0.0L, 1.0L, -1.0L, 5.0L}) {
        auto approx = taylor_exp(x, 1e-18L, 100000);
        long double actual = expl(x);
        std::cout << "x = " << (double)x
                  << ": approx = " << (double)approx.value
                  << ", actual = " << (double)actual
                  << ", terms = " << approx.terms
                  << ", bound <= " << (double)approx.error_bound
                  << ", abs error = " << (double)fabsl(approx.value - actual)
                  << "\n";
    }
    return 0;
}
```
Uses the incremental-term method to evaluate the Maclaurin series of e^x, stopping when the term is smaller than a tolerance. The Lagrange remainder gives a conservative error bound using M = e^{|x|}. For large |x|, many terms are needed; range reduction can improve this in production code.
```cpp
#include <iostream>
#include <vector>
#include <cmath>
#include <iomanip>

// Evaluate sum_{k=0}^{n} f^{(k)}(a)/k! * h^k given derivative values at a.
long double taylor_from_derivs(const std::vector<long double>& derivs, long double h) {
    long double sum = 0.0L;
    long double term = 1.0L; // h^0 / 0!
    for (size_t k = 0; k < derivs.size(); ++k) {
        sum += derivs[k] * term;
        term *= h / (static_cast<long double>(k) + 1.0L); // update to h^{k+1}/(k+1)!
    }
    return sum;
}

int main() {
    std::cout.setf(std::ios::fixed);
    std::cout << std::setprecision(18);

    // Example: approximate sin(0 + h) with derivatives at a = 0.
    // Derivatives of sin at 0 cycle: [0, 1, 0, -1, 0, 1, 0, -1, ...]
    const int N = 9; // derivatives f^{(0)}..f^{(8)}, so the polynomial has degree N-1 = 8
    std::vector<long double> derivs(N);
    for (int k = 0; k < N; ++k) {
        int r = k % 4;
        if (r == 0) derivs[k] = 0.0L;      // sin(0)
        else if (r == 1) derivs[k] = 1.0L; // cos(0)
        else if (r == 2) derivs[k] = 0.0L; // -sin(0)
        else derivs[k] = -1.0L;            // -cos(0)
    }

    long double h = 0.5L; // evaluate sin(0.5)
    long double approx = taylor_from_derivs(derivs, h);
    long double actual = sinl(h);

    // Error bound using |f^{(k)}| <= 1 for sin: for the degree-(N-1) polynomial,
    // |R_{N-1}| <= |h|^N / N!
    long double error_bound = powl(fabsl(h), N) / tgammal(N + 1.0L); // h^N / N!

    std::cout << "sin(0.5): approx = " << (double)approx
              << ", actual = " << (double)actual
              << ", N terms = " << N
              << ", bound <= " << (double)error_bound
              << ", abs error = " << (double)fabsl(approx - actual)
              << "\n";

    return 0;
}
```
Given the derivatives at a center a (here a=0 for sin), this code evaluates the Taylor polynomial for f(a+h). It updates the h^k/k! factor incrementally to be fast and stable. For sin, all derivative magnitudes are ≤ 1, so the Lagrange bound simplifies nicely.
```cpp
#include <iostream>
#include <functional>
#include <cmath>
#include <limits>
#include <iomanip>

struct DerivResult {
    double value;
    double error_est;
    int evals;
    double h_used;
};

DerivResult derivative_central_richardson(const std::function<double(double)>& f, double x) {
    // Choose a step roughly balancing truncation (O(h^2)) and rounding errors.
    double eps = std::numeric_limits<double>::epsilon();
    double h0 = std::cbrt(eps) * std::max(1.0, std::abs(x)); // always positive

    auto D = [&](double h) {
        double fp = f(x + h);
        double fm = f(x - h);
        return (fp - fm) / (2.0 * h);
    };

    double D1 = D(h0);       // O(h0^2)
    double D2 = D(h0 * 0.5); // O((h0/2)^2)

    // Richardson extrapolation cancels the leading O(h^2) error.
    double Dr = (4.0 * D2 - D1) / 3.0;
    double err_est = std::abs(Dr - D2); // heuristic error estimate

    return {Dr, err_est, 4, h0};
}

int main() {
    std::cout.setf(std::ios::fixed);
    std::cout << std::setprecision(16);

    auto fsin = [](double x){ return std::sin(x); };
    auto fexp = [](double x){ return std::exp(x); };

    {
        double x = 1.0;
        auto r = derivative_central_richardson(fsin, x);
        double truth = std::cos(x);
        std::cout << "f'(x)=cos(x) at x=1: approx=" << r.value
                  << ", actual=" << truth
                  << ", est.err=" << r.error_est
                  << ", evals=" << r.evals
                  << ", h=" << r.h_used << "\n";
    }

    {
        double x = -2.0;
        auto r = derivative_central_richardson(fexp, x);
        double truth = std::exp(x);
        std::cout << "f'(x)=e^x at x=-2: approx=" << r.value
                  << ", actual=" << truth
                  << ", est.err=" << r.error_est
                  << ", evals=" << r.evals
                  << ", h=" << r.h_used << "\n";
    }

    return 0;
}
```
Central differences come directly from Taylor expansions at x±h and give an O(h^2) derivative. Combining two step sizes with Richardson extrapolation cancels the leading error, producing an O(h^4) estimate and a practical error indicator.