#%config InlineBackend.figure_format = 'svg'
from pylab import *
from scipy.interpolate import barycentric_interpolate
Given an integer $n \ge 0$ and a function $f$, we can approximate it by a degree $n$ interpolating polynomial $p$. The approximation depends on the choice of nodes; different choices lead to different errors $f - p$. Is there a polynomial which gives the smallest error?
7.4.1 Best approximation in max norm

Let $f : [-1,1] \to \re$ be a continuous function. Given the degree $n \ge 0$, find a $p^* \in \poly_n$ for which the error $\norm{f-p}$ is minimized, i.e.,

$$\norm{f-p^*} \le \norm{f-p}, \qquad \forall p \in \poly_n$$

If such a $p^*$ exists, then define

$$E_n = E_n(f) := \min_{p \in \poly_n} \norm{f-p} = \norm{f-p^*}$$

We will use the maximum norm

$$\norm{\phi} = \max_{x \in [-1,+1]} |\phi(x)|$$

This type of approximation is also called a minimax approximation since

$$E_n = \min_{p \in \poly_n} \max_{x \in [-1,1]} |f(x) - p(x)| = \norm{f-p^*}$$

and $E_n$ is the minimum achievable approximation error by any polynomial of degree $n$.
Example 1 (Degree 1 minimax for $\exp(x)$)

Consider the function

$$f(x) = \exp(x), \qquad x \in [-1,1]$$

and let us try to find the minimax approximation of the form

$$p_1(x) = a_0 + a_1 x$$

First, try interpolating at $x = -1, 1$:
xmin, xmax = -1.0, 1.0
f = lambda x: exp(x)
xg = linspace(xmin,xmax,1000)   # fine grid for plotting and error estimation
fg = f(xg)
xu = linspace(xmin,xmax,2)      # interpolate at the two end points
fu = f(xu)
fug = barycentric_interpolate(xu,fu,xg)
print("Max error = ", abs(fg - fug).max())
figure(figsize=(10,4))
subplot(1,2,1)
plot(xu,fu,'o')
plot(xg, fg, label='exp(x)')
plot(xg, fug, label='$p_1(x)$')
legend(), xlabel('x'), grid(True)
subplot(1,2,2)
plot(xu,0*xu,'o')
plot(xg,fg-fug)
title('$f(x)-p_1(x)$')
xlabel('x'), grid(True);
Max error = 0.5576031261175984
We observe that the error is small at the end points and large in the middle; the error $f - p_1$ does not oscillate. Clearly, by sliding the straight line around, we can reduce the error in the middle while increasing it at the end points. Below we interpolate at $x = -0.5, 0.7$.
xu = array([-0.5,0.7])
fu = f(xu)
fug = barycentric_interpolate(xu,fu,xg)
print("Max error = ", abs(fg - fug).max())
figure(figsize=(10,4))
subplot(1,2,1)
plot(xu,fu,'o')
plot(xg, fg, label='exp(x)')
plot(xg, fug, label='$p_1(x)$')
legend(), xlabel('x'), grid(True)
subplot(1,2,2)
plot(xu,0*xu,'o')
plot(xg,fg-fug)
title('$f(x)-p_1(x)$')
xlabel('x'), grid(True);
Max error = 0.3527236090491077
The error is now smaller, and the error curve oscillates, taking both negative and positive values.
The best choice is to make the maxima and minima of the error have the same magnitude but opposite signs, by adjusting the slope and position of the straight line.
Minimax approximation. By looking at the error curve, we see that the maximum error is achieved at the end points:

$$\begin{align}
f(-1) - p_1(-1) &= \ee^{-1} - (a_0 - a_1) = E_1 \\
f(1) - p_1(1) &= \ee^{1} - (a_0 + a_1) = E_1
\end{align}$$

There is an intermediate point $x_3$ where the error is $-E_1$:

$$\begin{align}
f(x_3) - p_1(x_3) &= \ee^{x_3} - (a_0 + a_1 x_3) = -E_1 \\
f'(x_3) - p_1'(x_3) &= \ee^{x_3} - a_1 = 0
\end{align}$$

We have four equations for the four unknowns $a_0, a_1, E_1, x_3$, and the solution is

$$\begin{aligned}
a_1 &= \half(\ee - \ee^{-1}) \approx 1.1752 \\
x_3 &= \log(a_1) \approx 0.1614 \\
E_1 &= \half \ee^{-1} + \frac{x_3}{4}(\ee - \ee^{-1}) \approx 0.2788 \\
a_0 &= E_1 + (1-x_3) a_1 \approx 1.2643
\end{aligned}$$

so that

$$p_1^*(x) \approx 1.2643 + 1.1752 x$$

p1 = lambda x: 1.2643 + 1.1752 * x
print("Max error = ", abs(fg - p1(xg)).max())
figure(figsize=(10,4))
subplot(1,2,1)
plot(xg, fg, label='exp(x)')
plot(xg, p1(xg), label='$p_1(x)$')
legend(), xlabel('x'), grid(True)
subplot(1,2,2)
plot(xg,fg-p1(xg))
a1 = 0.5*(exp(1)-1/exp(1))
x3 = log(a1)
E1 = 0.5*exp(-1) + 0.25*x3*(exp(1) - exp(-1))
plot([-1,-1],[0,E1]), text(-1,0.5*E1,'$+E_1$',ha='center',backgroundcolor='white')
plot([+1,+1],[0,E1]), text(+1,0.5*E1,'$+E_1$',ha='center',backgroundcolor='white')
plot([x3,x3],[0,-E1]), text(x3,-0.5*E1,'$-E_1$',ha='center',backgroundcolor='white')
plot([-1,x3,1],[0,0,0],'*')
title('$f(x)-p_1(x)$')
xlabel('x'), grid(True);
Max error = 0.27882229893333355
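As a cross-check of the closed-form solution, we can also solve the four equations numerically. This is a minimal sketch assuming scipy.optimize.fsolve and a rough initial guess; neither is part of the derivation above.

from scipy.optimize import fsolve

def equations(v):
    a0, a1, E1, x3 = v
    return [exp(-1.0) - (a0 - a1) - E1,     # f(-1) - p1(-1) = +E1
            exp( 1.0) - (a0 + a1) - E1,     # f(+1) - p1(+1) = +E1
            exp(x3) - (a0 + a1*x3) + E1,    # f(x3) - p1(x3) = -E1
            exp(x3) - a1]                   # f'(x3) - p1'(x3) = 0

a0, a1, E1, x3 = fsolve(equations, [1.0, 1.0, 0.3, 0.0])
print("a0 =", a0, " a1 =", a1, " E1 =", E1, " x3 =", x3)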
The error of the minimax approximation takes the values $+E_1, -E_1, +E_1$ at three points. We say that the error equioscillates at $1 + 2 = 3$ points.
Definition 1 (Equioscillation)

We will say that a function $\phi : [a,b] \to \re$ equioscillates at $m$ points if there are $m$ points $x_i$ such that

$$\phi(x_i) = \pm (-1)^i E, \qquad 0 \le i \le m-1$$

where

$$E = \max_{x \in [a,b]} |\phi(x)|$$

As we show in the theorem below, equioscillation of the error $f-p$ is a necessary and sufficient condition for the best approximation.
Example 2 (Cubic minimax for $\exp(x)$)

We can find a cubic polynomial that gives the best approximation in the maximum norm. The solution can be computed by the Remez algorithm and yields

$$p_3^*(x) = 0.994579 + 0.995668 x + 0.542973 x^2 + 0.179533 x^3$$

Let us compare the cubic minimax approximation to cubic interpolating polynomials at uniform and Chebyshev points.
xmin, xmax = -1.0, 1.0
f = lambda x: exp(x)
# Minimax
p3 = lambda x: 0.994579 + 0.995668 * x + 0.542973 * x**2 + 0.179533 * x**3
# For plotting and error
xg = linspace(xmin,xmax,1000)
fg = f(xg)
n = 3 # degree
# Uniform points
xu = linspace(xmin,xmax,n+1)
fu = f(xu)
fug = barycentric_interpolate(xu,fu,xg)
# Chebyshev points of I kind
m = n+1 # Number of points
j = linspace(0,m-1,m)
theta = (2*j+1)*pi/(2*m)
xc1 = cos(theta)
fc1 = f(xc1)
fc1g = barycentric_interpolate(xc1,fc1,xg)
# Chebyshev points of II kind
xc2 = cos(linspace(0,pi,n+1))
fc2 = f(xc2)
fc2g = barycentric_interpolate(xc2,fc2,xg)
print('Error, Minimax = ', abs(fg-p3(xg)).max())
print('Error, Uniform = ', abs(fg-fug).max())
print('Error, Chebyshev I = ', abs(fg-fc1g).max())
print('Error, Chebyshev II = ', abs(fg-fc2g).max())
plot(xg, fg-p3(xg), label='Cubic minimax')
plot(xg, fg-fug, '--', label='Cubic interp: Uniform')
plot(xg, fg-fc1g, '--', label='Cubic interp: Chebyshev I')
plot(xg, fg-fc2g, '--', label='Cubic interp: Chebyshev II')
legend(), grid(True), xlabel('x');
Error, Minimax = 0.005528828459044899
Error, Uniform = 0.009984724810795154
Error, Chebyshev I = 0.006656866235436709
Error, Chebyshev II = 0.010880165039992784
We see that the error of the minimax approximation takes maximum and minimum values of the same magnitude $E_3 \approx 0.0055$ alternately at five points; the error equioscillates at $3+2=5$ points. The interpolation at Chebyshev points of the first kind comes closest to the minimax approximation.
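As a quick numerical check (a sketch that simply locates extrema on the discrete grid xg used above), we can list the extrema of the cubic minimax error and confirm that there are five of them with values close to $\pm E_3$.

err = fg - p3(xg)
d = diff(err)
idx = where(d[:-1] * d[1:] < 0)[0] + 1              # interior local extrema of the error
xe = concatenate(([xg[0]], xg[idx], [xg[-1]]))      # add the end points
ee = concatenate(([err[0]], err[idx], [err[-1]]))
print("Extrema locations =", xe)
print("Error values      =", ee)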
Example 3 (Using chebfun)

We can use chebfun to compute and visualize the best approximation. For the function $f(x) = |x|$ on $[-1,+1]$, the Matlab code minimax_abs.m shown below generates the figure described here.

Figure: Degree 4 minimax approximation of $f(x) = |x|$, $x \in [-1,1]$, and its error, which equioscillates at 7 points.

For the minimax approximation of degree 4, the error equioscillates at $4 + 3 = 7$ points.
close all; clear all
x = chebfun('x'); f = abs(x);
for n = 2:2:4
figure()
subplot(1,2,1), hold off, plot(f,'k','LineWidth',2), grid on
[p,err] = minimax(f,n);hold on,plot(p,'r','LineWidth',2),axis([-1 1 -.2 1.2])
FS = 'fontsize';
title(['Function and best approx, n = ' int2str(n)],FS,9)
subplot(1,2,2), hold off, plot(f-p,'LineWidth',2), grid on, hold on
axis([-1 1 -.15 .15]), title('Error curve',FS,9)
plot([-1 1],err*[1 1],'--k'), plot([-1 1],-err*[1 1],'--k')
end
7.4.2 Characterization of minimax

Theorem 1

(1) A continuous function $f : [-1,+1] \to \re$ has a unique best approximation $p^* \in \poly_n$.

(2) Any polynomial $p \in \poly_n$ is equal to $p^*$ iff $f-p$ equioscillates in at least $n+2$ extreme points.
(1) Existence of best approximation: The function

$$\phi: \poly_n \to \re, \qquad \phi(p) = \norm{f-p}$$

is continuous[1]. Since one candidate approximation is the zero polynomial, we know that if $p^*$ exists, then $\norm{f-p^*} \le \norm{f-0} = \norm{f}$, and hence $p^*$ lies in

$$V_n = \{ p \in \poly_n : \norm{f-p} \le \norm{f} \} \subset \poly_n$$

$V_n$ is a closed[2] and bounded[3] subset of the finite dimensional normed space $\poly_n$; hence $V_n$ is compact (Kreyszig, 1978, Theorem 2.5-3). Any continuous function on a compact set achieves its minimum and maximum. This proves the existence of $p^*$.
(2) Equioscillation $\implies$ optimality: Suppose $f-p$ takes equal extreme values $\pm E$ with alternating signs at $n+2$ points $x_0 < x_1 < \ldots < x_{n+1}$; wlog we assume the error is $+E$ at $x = x_0$,

$$(f-p)(x_i) = (-1)^i E, \qquad i=0,1,\ldots,n+1$$

Suppose there exists a $q \in \poly_n$ with smaller error, $\norm{f-q} < \norm{f-p}$. Then

$$(p-q)(x_i) = (p-f)(x_i) + (f-q)(x_i) = -(-1)^i E + (f-q)(x_i)$$

Hence, since $-E < (f-q)(x) < +E$, we get

$$\begin{align}
(p-q)(x_0) &= -E + \underbrace{(f-q)(x_0)}_{<E} < 0 \\
(p-q)(x_1) &= +E + \underbrace{(f-q)(x_1)}_{> -E} > 0 \\
&\textrm{etc.}
\end{align}$$

Hence $p-q$ takes non-zero values with alternating signs at the $n+2$ equioscillation points, implying that $p-q=0$ at at least $n+1$ points in between. Since $p-q$ is of degree at most $n$, this implies that $p=q$ everywhere, and hence $\norm{f-q} = \norm{f-p}$, contradicting the assumption that $q$ has a smaller error.
(3) Optimality $\implies$ equioscillation at $n+2$ points: Let $E = \norm{f-p}$ and suppose $f-p$ equioscillates at fewer than $n+2$ points, say at $k+1 < n+2$ points. Without loss of generality, suppose the leftmost extremum is one where $f-p$ takes the value $-E$. Then there are points $-1 < \xi_1 < \ldots < \xi_k < +1$ with $k \le n$ where $f-p$ is zero. Then $p(x) - \epsilon (\xi_1-x)\ldots(\xi_k-x)$ is a better approximation of degree $n$ than $p$ for small $\epsilon > 0$.

We can check this as follows. If $x \in [-1,\xi_1]$, then

$$(f-p)(x) + \epsilon (\xi_1-x)\ldots(\xi_k-x) \ge -E + \epsilon\,(+) > -E$$

If $x \in [\xi_1,\xi_2]$, then

$$(f-p)(x) + \epsilon (\xi_1-x)\ldots(\xi_k-x) \le E + \epsilon\,(-) < E$$

etc. Hence $p$ is not the best approximation. However, if $f-p$ equioscillates at at least $n+2$ points, then we cannot improve it further by this argument, since $p(x) - \epsilon (\xi_1-x)\ldots(\xi_k-x)$ would be of degree $\ge n+1$.
(4) Uniqueness: Suppose $p$ and $q$ are distinct best approximations. Define $r = \half (p+q)$. Since

$$E_n = \norm{f-p} = \norm{f-q}$$

we have

$$\norm{f-r} \le \half \norm{f-p} + \half \norm{f-q} = E_n$$

We cannot have $\norm{f-r} < E_n$, hence $\norm{f-r} = E_n$ and $r$ is also a best approximation. So $f-r$ must equioscillate at at least $n+2$ extreme points. At any of these points $x_k$, suppose

$$\begin{align}
(f-r)(x_k) &= +E_n \\
&\Downarrow \\
\frac{f(x_k)-p(x_k)}{2} + \frac{f(x_k)-q(x_k)}{2} &= +E_n
\end{align}$$

But

$$-E_n \le (f-p)(x_k) \le E_n, \qquad -E_n \le (f-q)(x_k) \le E_n$$

We cannot have

$$(f-p)(x_k) < E_n \qquad\textrm{and/or}\qquad (f-q)(x_k) < E_n$$

and hence

$$E_n = f(x_k) - p(x_k) = f(x_k) - q(x_k) \qquad \Longrightarrow \qquad p(x_k) = q(x_k)$$

A similar conclusion follows if $(f-r)(x_k) = -E_n$. Since $p = q$ at $n+2$ distinct points but each is of degree $\le n$, we conclude that $p = q$ everywhere, contradicting the assumption that they are distinct.
The error curve for a best approximation may have more than $n+2$ points of equioscillation, and this will happen if $f$ and $n$ are both even or both odd. For the function $f(x)=|x|$, the degree 2 polynomial equioscillates at 5 points and the degree 4 polynomial equioscillates at 7 points.
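The degree 2 case can be verified directly: the best quadratic approximation of $|x|$ on $[-1,1]$ is known in closed form, $p_2^*(x) = x^2 + 1/8$ (a classical result quoted here without derivation), and its error equals $\pm 1/8$ alternately at the five points $x = -1, -1/2, 0, 1/2, 1$.

p2 = lambda x: x**2 + 0.125        # assumed closed form of the degree 2 minimax of |x|
e2 = lambda x: abs(x) - p2(x)      # its error
print("Error at -1,-1/2,0,1/2,1 :", [e2(x) for x in (-1.0, -0.5, 0.0, 0.5, 1.0)])
xg = linspace(-1, 1, 1001)
print("Max |error| on grid      :", abs(e2(xg)).max())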
7.4.3 Remez algorithm

The algorithm to find the best approximation was proposed by E. Remez (1934) and is explained in Veidinger, 1960. Let $f : [-1,1] \to \re$ and let us write $p(x)$ in terms of Chebyshev polynomials,

$$p(x) = \sum_{j=0}^n a_j T_j(x)$$

Step 1. Choose $n+2$ points $\{x_0, x_1, \ldots, x_{n+1}\} \subset [-1,1]$ such that

$$\begin{align}
f(x_i) - p(x_i) &= (-1)^i E \\
p(x_i) + (-1)^i E &= f(x_i) \\
\sum_{j=0}^n a_j T_j(x_i) + (-1)^i E &= f(x_i), \qquad 0 \le i \le n+1
\end{align}$$

This is a linear system of $n+2$ equations for the $n+2$ unknowns $\{a_0, a_1, \ldots, a_n, E\}$ which can be solved. The $E$ may be positive or negative.

Step 2. If the condition $E = \pm \max_x |f(x) - p(x)|$ is satisfied, then stop the iteration. Otherwise continue with the next step.

Step 3. The error oscillates in sign at the $n+2$ points, so it is zero at some $n+1$ intermediate points $\{z_0, z_1, \ldots, z_n\}$:

$$\textrm{Find} \qquad z_j \in [x_j, x_{j+1}], \qquad f(z_j) - p(z_j) = 0, \qquad 0 \le j \le n$$

by using some root finding method.

Step 4. In the $n+2$ intervals $[-1,z_0], [z_0, z_1], \ldots, [z_n, 1]$, find the points $\{x_0^*, x_1^*, \ldots, x_{n+1}^*\}$ where the error $f-p$ takes its maximum or minimum value. Do this by solving $(f'-p')(x)=0$ and also checking the end points of the intervals.

Step 5. Set $x_j = x_j^*$ and go to Step 1.
We can observe that the algorithm is complicated and expensive; the need to find several roots and maxima or minima in every iteration adds to the cost.
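The sketch below illustrates the iteration for $f(x) = \exp(x)$ and $n = 1$. It is a minimal, non-robust version: a dense grid replaces the root finding of Step 3 and the derivative solves of Step 4, and it assumes the error has exactly $n+1$ sign changes on the grid.

from numpy.linalg import solve
from numpy.polynomial.chebyshev import chebval

f = lambda x: exp(x)
n = 1
x = cos(linspace(pi, 0, n+2))    # initial reference points in [-1,1]
xg = linspace(-1.0, 1.0, 5001)   # dense grid used to locate extrema

for it in range(10):
    # Step 1: linear system for the Chebyshev coefficients a_0..a_n and the level E
    A = zeros((n+2, n+2))
    for j in range(n+1):
        A[:, j] = chebval(x, [0.0]*j + [1.0])   # column of T_j(x_i)
    A[:, n+1] = (-1.0)**arange(n+2)             # alternating +E/-E column
    sol = solve(A, f(x))
    a, E = sol[:n+1], sol[n+1]
    err = f(xg) - chebval(xg, a)
    print("iter =", it, " E =", E, " max|f-p| =", abs(err).max())
    # Step 2: stop when the level |E| matches the maximum error on the grid
    if abs(err).max() - abs(E) < 1.0e-12:
        break
    # Steps 3-5 (grid version): split [-1,1] at the sign changes of the error and
    # take the point of largest |error| in each piece as the new reference point
    z = where(sign(err[:-1]) != sign(err[1:]))[0]
    edges = [0] + list(z + 1) + [len(xg)]
    x = array([xg[lo + argmax(abs(err[lo:hi]))] for lo, hi in zip(edges[:-1], edges[1:])])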
7.4.4 Minimax versus interpolation

Given a degree $n \ge 0$, let $p_n^*$ be the minimax approximation and let $p_n$ be the interpolating polynomial at some nodes $\{x_j\}$, which can be written as

$$p_n(x) = \sum_{j=0}^n f(x_j) \ell_j(x)$$

where $\ell_j$ are the Lagrange polynomials. Since $p_n^*$ is itself a polynomial of degree $n$, we can write

$$p_n^*(x) = \sum_{j=0}^n p_n^*(x_j) \ell_j(x)$$

and

$$\begin{align}
p_n^*(x) - p_n(x) &= \sum_{j=0}^n (p_n^*(x_j) - f(x_j)) \ell_j(x) \\
|p_n^*(x) - p_n(x)| &\le \sum_{j=0}^n |p_n^*(x_j) - f(x_j)| \cdot |\ell_j(x)| \\
&\le \left(\sum_{j=0}^n |\ell_j(x)| \right) \norm{f - p_n^*} \\
\norm{p_n^* - p_n} &\le \left( \max_x \sum_{j=0}^n |\ell_j(x)| \right) \norm{f - p_n^*}
\end{align}$$

Define the Lebesgue constant

$$\Lambda_n = \max_x \sum_{j=0}^n |\ell_j(x)|$$

whose value depends on the distribution of points. Therefore

$$\begin{align}
\norm{f - p_n} &= \norm{f - p_n^* + p_n^* - p_n} \\
&\le \norm{f - p_n^*} + \norm{p_n^* - p_n} \\
&\le (1 + \Lambda_n) \norm{f - p_n^*}
\end{align}$$

The interpolation error is at most a factor $1 + \Lambda_n$ larger than the error of the best approximation. If $\Lambda_n \gg 1$, then $p_n$ may be far from the best approximation.
Example 4 (Lebesgue function)

Define the Lebesgue function

$$\lambda_n(x) = \sum_{j=0}^n |\ell_j(x)|$$

def Lagrange(j,xi,x):
    f = ones_like(x)
    n = len(xi)
    for k in range(n):
        if k != j:
            f = f * (x - xi[k]) / (xi[j] - xi[k])
    # If x close to xi[j], set to 1
    f[argwhere(abs(xi[j]-x) < 1.0e-13)] = 1.0
    return f
def LebesgueFunction(xi, x):
    f = zeros_like(x)
    n = len(xi)
    for i in range(n):
        f += abs(Lagrange(i,xi,x))
    return f
def PlotLebesgueFunction(xi,logy=False,ng=100):
    if logy:
        semilogy(xi,ones_like(xi),'o')
    else:
        plot(xi,0*xi,'o')
    for i in range(len(xi)-1):
        xg = linspace(xi[i],xi[i+1],ng)
        if logy:
            semilogy(xg, LebesgueFunction(xi,xg), 'k-')
        else:
            plot(xg, LebesgueFunction(xi,xg), 'k-')
    grid(True), xlabel('x');
With uniform points
n = 14
xi = linspace(-1,1,n+1)
PlotLebesgueFunction(xi)
title('Lebesgue function for '+str(n+1)+' uniform points');
n = 29
xi = linspace(-1,1,n+1)
PlotLebesgueFunction(xi,logy=True)
title('Lebesgue function for '+str(n+1)+' uniform points');
Note the log scale in the above figure. The large values arise near the end points, where we also observe the Runge phenomenon.
With Chebyshev points
n = 14
xi = cos(linspace(0,pi,n+1))
PlotLebesgueFunction(xi)
title('Lebesgue function for '+str(n+1)+' Chebyshev points');
n = 29
xi = cos(linspace(0,pi,n+1))
PlotLebesgueFunction(xi)
title('Lebesgue function for '+str(n+1)+' Chebyshev points');
The values are small and grow slowly with $n$.
Theorem 2 (Lebesgue constant)

(1) The Lebesgue constants $\Lambda_n$ for degree $n \ge 0$ polynomial interpolation in any set of $n+1$ distinct points in $[-1,1]$ satisfy

$$\Lambda_n \ge \frac{2}{\pi} \log(n+1) + C, \qquad C = \frac{2}{\pi}(\gamma + \log(4/\pi)) \approx 0.52125$$

where $\gamma \approx 0.577$ is Euler's constant.

(2) For Chebyshev points, they satisfy

$$\Lambda_n \le \frac{2}{\pi} \log(n+1) + 1$$

(3) For equispaced points, they satisfy

$$\Lambda_n > \frac{2^{n-2}}{n^2}$$

See the discussion after Theorem 15.2 in Trefethen, 2019 for a history of these results.
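We can check these bounds numerically with the LebesgueFunction defined above. The helper LebesgueConstant below is introduced only for this check (an assumption of this sketch); since the maximum is taken over a finite grid in each subinterval, the computed values slightly underestimate $\Lambda_n$.

def LebesgueConstant(xi, ng=500):
    # Approximate Lambda_n: maximize the Lebesgue function over a fine grid
    # in each subinterval, as in PlotLebesgueFunction above
    lam = 0.0
    xs = sort(xi)
    for i in range(len(xs)-1):
        xg = linspace(xs[i], xs[i+1], ng)
        lam = maximum(lam, LebesgueFunction(xi, xg).max())
    return lam

for n in [5, 10, 20]:
    xu = linspace(-1, 1, n+1)          # uniform points
    xc = cos(linspace(0, pi, n+1))     # Chebyshev points of II kind
    print("n =", n,
          ": uniform =", LebesgueConstant(xu),
          ", Chebyshev =", LebesgueConstant(xc),
          ", (2/pi)log(n+1)+1 =", 2/pi*log(n+1) + 1,
          ", 2^(n-2)/n^2 =", 2.0**(n-2)/n**2)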
For $n = 100$:

uniform points: $\Lambda_{100} > 3 \times 10^{25}$
Chebyshev points: $\Lambda_{100} < 1.8$

A large Lebesgue constant also means there is a possibility of large errors in computation arising from errors in the data. If the function values are not known accurately and contain errors, then we are evaluating

$$\tilde p(x) = \sum_{j=0}^n (f_j + \epsilon_j) \ell_j(x)$$

instead of

$$p(x) = \sum_{j=0}^n f_j \ell_j(x)$$

The relative error satisfies

$$\frac{|\tilde p(x) - p(x)|}{\max_j |\epsilon_j|} \le \sum_{j=0}^n |\ell_j(x)|$$

and if the right hand side is large, then there is a possibility of large error in the evaluated polynomial.
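As a small illustration (a sketch; the 1e-10 noise level and the random perturbations are choices made here), we perturb the data of $\exp(x)$ at 30 uniform points and observe the amplification predicted by the large Lebesgue constant.

n = 29
xi = linspace(-1, 1, n+1)
eps = 1.0e-10 * (2*rand(n+1) - 1)       # random data errors of size at most 1e-10
xg = linspace(-1, 1, 1000)
p  = barycentric_interpolate(xi, exp(xi), xg)
pt = barycentric_interpolate(xi, exp(xi) + eps, xg)
print("Max data error     =", abs(eps).max())
print("Max change in p(x) =", abs(pt - p).max())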
[1] This is true for any normed vector space and is a consequence of $|\norm{x} - \norm{y}| \le \norm{x-y}$.

[2] Show that it is closed by showing all limit points belong to $V_n$.

[3] It is bounded because $\norm{p} = \norm{p-f+f} \le \norm{p-f} + \norm{f} \le \norm{f} + \norm{f} = 2\norm{f} < \infty$.

Kreyszig, E. (1978). Introductory Functional Analysis With Applications. John Wiley & Sons, Inc.

Veidinger, L. (1960). On the numerical determination of the best approximations in the Chebyshev sense. Numerische Mathematik, 2(1), 99-105. 10.1007/BF01386215

Trefethen, L. N. (2019). Approximation Theory and Approximation Practice, Extended Edition. Society for Industrial and Applied Mathematics. 10.1137/1.9781611975949