Polynomials are the simplest functions we can use since they require only basic arithmetic operations. They can provide a good approximation, as shown by the Weierstrass theorem.
## 0.1 Interpolation using monomials

Given $N+1$ data points $(x_i, y_i)$, $i = 0, 1, \ldots, N$ with $y_i = f(x_i)$, we can try to find a polynomial of degree $N$

$$
p(x) = a_0 + a_1 x + \ldots + a_N x^N
$$

that satisfies the interpolation conditions

$$
p(x_i) = y_i, \qquad i = 0, 1, \ldots, N
$$

This is a linear system of $N+1$ equations for the $N+1$ coefficients that can be written in matrix form
$$
\underbrace{\begin{bmatrix}
1 & x_0 & x_0^2 & \ldots & x_0^N \\
1 & x_1 & x_1^2 & \ldots & x_1^N \\
\vdots & \vdots & \vdots & \ldots & \vdots \\
1 & x_N & x_N^2 & \ldots & x_N^N
\end{bmatrix}}_{V}
\begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_N \end{bmatrix} =
\begin{bmatrix} y_0 \\ y_1 \\ \vdots \\ y_N \end{bmatrix}
$$

This has a unique solution if the Vandermonde matrix $V$ is non-singular.
Since

$$
\det V = \prod_{j=0}^{N} \prod_{k=j+1}^{N} (x_k - x_j)
$$

the matrix $V$ is non-singular provided the points $\{x_i\}$ are distinct, and then we can solve the problem.
We can show this without computing the determinant. Assume that the $\{x_i\}$ are distinct. It is enough to show that the only solution of $Va = 0$ is $a = 0$. Note that the set of $N+1$ equations $Va = 0$ is of the form

$$
p(x_i) = 0, \qquad i = 0, 1, \ldots, N
$$

which implies that $p(x)$ has $N+1$ distinct roots. But since $p$ is a polynomial of degree $N$, this implies that $p(x) \equiv 0$ and hence each $a_i = 0$.
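As a quick illustration, the system $Va = y$ can be solved directly with `numpy`. This is a minimal sketch; the nodes and the function are our own choices, and `vander` with `increasing=True` builds exactly the matrix above:

```python
from numpy import linspace, vander, cos, pi, allclose
from numpy.linalg import solve

# Example data (our choice): f(x) = cos(pi x) at 5 distinct points
x = linspace(0.0, 1.0, 5)
y = cos(pi*x)

V = vander(x, increasing=True)   # columns: 1, x, x^2, ..., x^N
a = solve(V, y)                  # monomial coefficients a_0, ..., a_N
print(allclose(V @ a, y))        # interpolation conditions are satisfied
```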
## 0.2 Condition number of $V$

If we want to solve the interpolation problem by solving the matrix problem, then the condition number of the matrix becomes important. Matrices with large condition numbers cannot be solved accurately on a computer due to the blow-up of round-off errors.
Take $(x_0, x_1, x_2) = (100, 101, 102)$. Then

$$
V = \begin{bmatrix}
1 & 100 & 10000 \\
1 & 101 & 10201 \\
1 & 102 & 10404
\end{bmatrix}, \qquad \mathrm{cond}(V) \approx 10^8
$$

If we scale the $x$ as $x \to \frac{x}{x_0}$, then
$$
\tilde{V} = \begin{bmatrix}
1 & 1.00 & 1.0000 \\
1 & 1.01 & 1.0201 \\
1 & 1.02 & 1.0404
\end{bmatrix}, \qquad \mathrm{cond}(\tilde{V}) \approx 10^5
$$

Or we can shift the origin to $x_0$, i.e., $x \to x - x_0$:
$$
\hat{V} = \begin{bmatrix}
1 & 0 & 0 \\
1 & 1 & 1 \\
1 & 2 & 4
\end{bmatrix}, \qquad \mathrm{cond}(\hat{V}) \approx 14
$$

The condition number can thus be improved by scaling and/or shifting the variables.
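These condition numbers can be checked numerically. A minimal sketch, using the same nodes as above:

```python
from numpy import array, vander
from numpy.linalg import cond

x = array([100.0, 101.0, 102.0])
print(cond(vander(x, increasing=True)))         # original nodes:  ~ 1e8
print(cond(vander(x/x[0], increasing=True)))    # scaled nodes:    ~ 1e5
print(cond(vander(x - x[0], increasing=True)))  # shifted nodes:   ~ 14
```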
It is usually better to map the data to the interval $[-1,+1]$ or $[0,1]$ and then solve the interpolation problem on the mapped interval.
Assuming that $x_0 < x_1 < \ldots < x_N$, define

$$
\xi = \frac{x - x_0}{x_N - x_0}
$$

and find a polynomial of the form

$$
p(\xi) = a_0 + a_1 \xi + \ldots + a_N \xi^N
$$

But still the condition number increases rapidly with $N$: for $N = 20$ we get a condition number of about $8 \times 10^8$ even when using the interval $[-1,+1]$.
We will use the `cond` function from `numpy.linalg` to compute the condition number. By default, it uses the 2-norm.
```python
from numpy import linspace, zeros
from numpy.linalg import cond
from matplotlib.pyplot import semilogy, xlabel, ylabel, grid

Nvalues, Cvalues = [], []
for N in range(1, 30):
    x = linspace(-1.0, +1.0, N+1)
    V = zeros((N+1, N+1))
    for j in range(0, N+1):
        V[j][:] = x**j  # row j holds x**j: this is the transpose of V as defined above
    Nvalues.append(N), Cvalues.append(cond(V.T))

semilogy(Nvalues, Cvalues, 'o-')
xlabel('N'), ylabel('cond(V)'), grid(True);
```
The condition number is large even for moderate values of $N$ around 30.
## 0.3 Lagrange interpolation

Lagrange interpolation provides the solution without having to solve a matrix problem. Define

$$
\pi_i(x) = (x-x_0) \ldots (x-x_{i-1})(x-x_{i+1}) \ldots (x-x_N)
$$

and

$$
\ell_i(x) = \frac{\pi_i(x)}{\pi_i(x_i)}
$$

Note that each $\ell_i$ is a polynomial of degree $N$ and $\ell_i(x_j) = \delta_{ij}$, i.e.,

$$
\ell_i(x_i) = 1, \qquad \ell_i(x_j) = 0, \quad j \ne i
$$

Then consider the polynomial of degree $N$ given by

$$
p_N(x) = \sum_{j=0}^{N} y_j \ell_j(x)
$$

By construction this satisfies

$$
p_N(x_i) = y_i, \qquad i = 0, 1, \ldots, N
$$

and hence is the solution to the interpolation problem.
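The Lagrange form can be evaluated directly, without forming any matrix. A short sketch; the function name `lagrange_interp` is ours:

```python
from numpy import asarray, ones_like

def lagrange_interp(xp, yp, x):
    # Evaluate p_N(x) = sum_j y_j * l_j(x) for nodes xp and data yp
    x = asarray(x, dtype=float)
    p = 0.0
    for i in range(len(xp)):
        li = ones_like(x)                          # build l_i(x) as a product
        for j in range(len(xp)):
            if j != i:
                li = li * (x - xp[j]) / (xp[i] - xp[j])
        p = p + yp[i] * li
    return p
```

For instance, `lagrange_interp(x, y, 0.5)` evaluates the interpolant at $x = 0.5$.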
Initially, let us use some Python functions to compute the interpolating polynomial. In particular, we use the `polyfit` and `polyval` functions. We will demonstrate this for the function

$$
f(x) = \cos(4\pi x), \qquad x \in [0,1]
$$

Note that `polyfit` actually performs a least squares fit, but we give it $N+1$ points and ask for a degree $N$ polynomial, which gives an interpolating polynomial.
Define the function:

```python
from numpy import cos, pi

xmin, xmax = 0.0, 1.0
f = lambda x: cos(4*pi*x)
```
Make a grid and evaluate the function at those points:

```python
from numpy import linspace

N = 8  # degree; we need N+1 points
x = linspace(xmin, xmax, N+1)
y = f(x)
```
Construct the polynomial using `polyfit`.
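Consistent with the later call `polyval(p, xe)`, a minimal version of this step is:

```python
from numpy import polyfit, polyval

p = polyfit(x, y, N)  # coefficients of the degree N interpolating polynomial
```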
Now we evaluate this on a larger grid for better visualization:

```python
from matplotlib.pyplot import plot, legend, title

M = 100
xe = linspace(xmin, xmax, M)
ye = f(xe)           # exact function
yp = polyval(p, xe)  # interpolating polynomial

plot(x, y, 'o', xe, ye, '--', xe, yp, '-')
legend(('Data points', 'Exact function', 'Polynomial'))
title('Degree ' + str(N) + ' interpolation');
```
## 1 Error in polynomial approximation

Let $p_N(x)$ be the degree $N$ polynomial which interpolates the given data $(x_i, y_i)$, $i = 0, 1, \ldots, N$. Then the error

$$
e_N(x) = f(x) - p_N(x)
$$

is bounded by

$$
|e_N(x)| \le \frac{M_{N+1}}{(N+1)!} |(x-x_0)(x-x_1)\ldots(x-x_N)|
$$

Assume for simplicity that $x_0 < x_1 < \ldots < x_N$. Since

$$
e_N(x_i) = 0, \qquad i = 0, 1, \ldots, N
$$

we can write the error as
$$
e_N(x) = f(x) - p_N(x) = (x-x_0)(x-x_1)\ldots(x-x_N) K(x)
$$

for some function $K(x)$. Choose an arbitrary $x_* \in [x_0, x_N]$, different from all the $x_i$, and define

$$
\Phi(x) = f(x) - p_N(x) - (x-x_0)(x-x_1)\ldots(x-x_N) K(x_*)
$$

If $f(x)$ has $N+1$ derivatives, then we can differentiate $N+1$ times to get

$$
\Phi^{(N+1)}(x) = f^{(N+1)}(x) - (N+1)! \, K(x_*)
$$

Clearly, $\Phi(x)$ vanishes at $N+2$ points, $\{x_0, x_1, \ldots, x_N, x_*\}$. By the mean value theorem, $\Phi'(x)$ vanishes at at least $N+1$ points in the interval $[x_0, x_N]$, $\Phi''(x)$ vanishes at at least $N$ points in $[x_0, x_N]$, etc. Continuing this way, $\Phi^{(N+1)}(x)$ vanishes at at least one point $\bar{x} \in [x_0, x_N]$:

$$
f^{(N+1)}(\bar{x}) - (N+1)! \, K(x_*) = 0 \quad\Longrightarrow\quad K(x_*) = \frac{1}{(N+1)!} f^{(N+1)}(\bar{x})
$$

and hence

$$
f(x_*) = p_N(x_*) + \frac{1}{(N+1)!} (x_*-x_0)(x_*-x_1)\ldots(x_*-x_N) f^{(N+1)}(\bar{x})
$$

Since $x_*$ was arbitrary, we can call it $x$ and we obtain the error formula

$$
e_N(x) = f(x) - p_N(x) = \frac{1}{(N+1)!} (x-x_0)(x-x_1)\ldots(x-x_N) f^{(N+1)}(\bar{x})
$$

Defining

$$
M_{N+1} = \max_{x \in [x_0, x_N]} |f^{(N+1)}(x)|
$$

we obtain the error bound.
Let us specialize the error estimate to the case of uniformly spaced points in the interval $[a,b]$ with

$$
x_i = a + ih, \qquad 0 \le i \le N, \qquad h = \frac{b-a}{N}
$$

We have $x_0 = a$ and $x_N = b$. Fix an $x$ and find $j$ such that $x_j \le x \le x_{j+1}$. Show that (see Assignment)

$$
|(x-x_j)(x-x_{j+1})| \le \frac{1}{4} h^2
$$

Hence

$$
\prod_{i=0}^{N} |x - x_i| \le \frac{h^2}{4} \prod_{i=0}^{j-1} (x - x_i) \prod_{i=j+2}^{N} (x_i - x)
$$

Now since $x \le x_{j+1}$ and $-x \le -x_j$,

$$
\prod_{i=0}^{N} |x - x_i| \le \frac{h^2}{4} \prod_{i=0}^{j-1} (x_{j+1} - x_i) \prod_{i=j+2}^{N} (x_i - x_j)
$$

Using

$$
x_{j+1} - x_i = (j-i+1)h, \qquad x_i - x_j = (i-j)h
$$

the inequality becomes

$$
\prod_{i=0}^{N} |x - x_i| \le \frac{h^{N+1}}{4} \prod_{i=0}^{j-1} (j-i+1) \prod_{i=j+2}^{N} (i-j) = \frac{h^{N+1}}{4} (j+1)! \, (N-j)!
$$

Finally, show that (see Assignment; for a proof, see the problems collection)

$$
(j+1)! \, (N-j)! \le N!, \qquad 0 \le j \le N-1
$$

which completes the proof of the theorem, since combining with the error formula gives

$$
|e_N(x)| \le \frac{M_{N+1}}{(N+1)!} \cdot \frac{h^{N+1} N!}{4} = \frac{M_{N+1} h^{N+1}}{4(N+1)}
$$
Assume that $|f^{(n)}(x)| \le M$ for all $x$ and $n$. Then the error of polynomial interpolation on uniformly spaced points is bounded by

$$
|f(x) - p(x)| \le \frac{M h^{N+1}}{4(N+1)}
$$

and the error goes to zero as $N \to \infty$.

The error bound follows from the two previous theorems. As $N$ increases, $h = (b-a)/N$ becomes less than one at some point, and beyond this, the right hand side in the error bound converges to zero.
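We can observe this decay numerically. A small sketch with our own choice of $f$ on $[0,1]$; it prints the maximum interpolation error as $N$ grows (for very large $N$, round-off eventually limits the accuracy of `polyfit`):

```python
from numpy import linspace, cos, pi, polyfit, polyval, abs

f = lambda x: cos(4*pi*x)
xe = linspace(0.0, 1.0, 1000)   # fine grid to estimate the max error
for N in range(4, 17, 4):
    x = linspace(0.0, 1.0, N+1)
    p = polyfit(x, f(x), N)
    print(N, abs(f(xe) - polyval(p, xe)).max())
```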
Functions like $\cos x$, $\sin x$, $\exp(x)$ satisfy the conditions of the above theorem. These conditions are quite strong and can be relaxed considerably. E.g., if $f(x) = \sin(\alpha x)$ then $|f^{(n)}(x)| \le |\alpha|^n$, and if $|\alpha| > 1$ then the derivatives can increase with $n$. If the derivatives satisfy

$$
|f^{(n)}(x)| \le C |\alpha|^n
$$

then the error estimate gives

$$
|f(x) - p(x)| \le \frac{C (\alpha h)^{N+1}}{4(N+1)}
$$

As $N$ increases, at some point we will satisfy $\alpha h < 1$, and beyond this point the right hand side goes to zero.
The problematic case is when the derivatives grow at a factorial rate,

$$
|f^{(n)}(x)| \le C n!
$$

so that the error bound is

$$
|f(x) - p(x)| \le \frac{C N! \, (b-a)^{N+1}}{4 N^{N+1}}
$$

which does not, in general, go to zero.
## 3 Difficulty of polynomial interpolation

Do the polynomial approximations $p_N$ converge to the true function $f$ as $N \to \infty$? The error formula seems to suggest so, due to the factor $\frac{1}{(N+1)!}$, provided the higher order derivatives of $f$ are bounded. On uniformly spaced points, we have seen the interpolants of $\cos(x)$ converge but those of the rational function $\frac{1}{1+16x^2}$ do not. This must be related to the behaviour of the derivatives of these functions.
Consider $y = \ln(x)$. Then its derivatives are

$$
\begin{aligned}
y' &= \frac{1}{x} \\
y'' &= -\frac{1}{x^2} \\
y''' &= \frac{2!}{x^3} \\
&\vdots \\
y^{(n)} &= \frac{(-1)^{n-1} (n-1)!}{x^n}
\end{aligned}
$$

Even though the curve $y = \ln(x)$ looks smooth near any value of $x$, as $n$ gets large, the derivatives become very large in size, and tend to behave like $n!$ or worse.
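If `sympy` is available, the factorial pattern in these derivatives can be verified symbolically; a minimal sketch:

```python
from sympy import symbols, log, diff

x = symbols('x')
for n in range(1, 6):
    # n-th derivative of ln(x): (-1)**(n-1) * (n-1)! / x**n
    print(n, diff(log(x), x, n))
```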
This is the general situation; for most functions, some higher order derivatives tend to grow as $n!$. Even for a polynomial $p_N(x)$, the derivatives grow in size until the $N$'th one, which is $a_N N!$, after which they suddenly all become zero.
**Example 7 (Runge phenomenon)**

Consider interpolating the following two functions on $[-1,1]$:

$$
f_1(x) = \exp(-5x^2), \qquad f_2(x) = \frac{1}{1 + 16 x^2}
$$

We will try uniformly spaced points and Chebyshev points.
Let us first plot the two functions.
```python
from numpy import exp, linspace
from matplotlib.pyplot import figure, plot, legend

xmin, xmax = -1.0, +1.0
f1 = lambda x: exp(-5.0*x**2)
f2 = lambda x: 1.0/(1.0 + 16.0*x**2)
xx = linspace(xmin, xmax, 100, True)
figure(figsize=(8,4))
plot(xx, f1(xx), xx, f2(xx))
legend(("$\\exp(-5x^2)$", "$1/(1+16x^2)$"));  # labels in plotting order
```
The two functions look qualitatively similar and both are infinitely differentiable.
```python
from numpy import linspace, cos, pi, polyfit, polyval
from matplotlib.pyplot import figure, subplot, plot, axis, text

def interp(f, points):
    xx = linspace(xmin, xmax, 100, True)
    ye = f(xx)
    figure(figsize=(9,8))
    for i in range(1, 7):
        N = 2*i
        subplot(3, 2, i)
        if points == 'uniform':
            x = linspace(xmin, xmax, N+1, True)
        else:  # any other value selects Chebyshev points
            theta = linspace(0, pi, N+1, True)
            x = cos(theta)
        y = f(x)
        P = polyfit(x, y, N)   # interpolating polynomial of degree N
        yy = polyval(P, xx)
        plot(x, y, 'o', xx, ye, '--', xx, yy)
        axis([xmin, xmax, -1.0, +1.1])
        text(-0.1, 0.0, 'N = ' + str(N))
```
Interpolate $f_1(x)$ on uniformly spaced points, using the `interp` function defined above:
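```python
interp(f1, 'uniform')
```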
Interpolate $f_2(x)$ on uniformly spaced points:
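```python
interp(f2, 'uniform')
```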
The above results are not good. Let us try $f_2(x)$ on Chebyshev points:
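```python
interp(f2, 'chebyshev')   # any value other than 'uniform' selects Chebyshev points
```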
What about interpolating $f_1(x)$ on Chebyshev points?
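```python
interp(f1, 'chebyshev')
```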
This also seems fine. So uniform points sometimes work; Chebyshev points work all the time.
## 4 Polynomial factor in error bound

The error of polynomial interpolation is given by

$$
f(x) - p_N(x) = \frac{\omega_N(x)}{(N+1)!} f^{(N+1)}(\xi)
$$

where $\xi \in I(x_0, x_1, \ldots, x_N, x)$, the smallest interval containing all the nodes and $x$, and

$$
\omega_N(x) = (x-x_0)\ldots(x-x_N)
$$

**Case $N=1$.**
In the case of linear interpolation,

$$
\omega_1(x) = (x-x_0)(x-x_1), \qquad h = x_1 - x_0
$$

Then

$$
\max_{x_0 \le x \le x_1} |\omega_1(x)| = \frac{h^2}{4}
$$

and the interpolation error bound is

$$
\max_{x_0 \le x \le x_1} |f(x) - p_1(x)| \le \frac{h^2}{8} \max_{x_0 \le x \le x_1} |f''(x)|
$$

**Case $N=2$.**
To bound $\omega_2(x)$ on $[x_0, x_2]$, shift the origin to $x_1$ so that

$$
\omega_2(x) = (x-h)\, x\, (x+h)
$$

Near the center of the data,

$$
\max_{x_1 - \frac{h}{2} \le x \le x_1 + \frac{h}{2}} |\omega_2(x)| = 0.375 h^3
$$

whereas on the whole interval

$$
\max_{x_0 \le x \le x_2} |\omega_2(x)| = \frac{2\sqrt{3}}{9} h^3 \approx 0.385 h^3
$$

In this case, the error is of similar magnitude whether $x$ is near the center or near the edge.

**Case $N=3$.**
We can shift the nodes so that they are symmetric about the origin. Then

$$
\omega_3(x) = \left(x^2 - \frac{9}{4}h^2\right) \left(x^2 - \frac{1}{4}h^2\right)
$$

and

$$
\max_{x_1 \le x \le x_2} |\omega_3(x)| = \frac{9}{16} h^4 \approx 0.56 h^4, \qquad
\max_{x_0 \le x \le x_3} |\omega_3(x)| = h^4
$$

In this case, the error near the endpoints can be about twice as large as the error near the middle.

**Case $N > 3$.**
The behaviour exhibited for $N=3$ is accentuated for larger degree. For $N=6$,

$$
\max_{x_2 \le x \le x_4} |\omega_6(x)| \approx 12.36 h^7, \qquad
\max_{x_0 \le x \le x_6} |\omega_6(x)| \approx 95.8 h^7
$$

and the error near the ends can be almost 8 times that near the center.
**Function $\omega_N(x)$ arising in interpolation error.** For a given set of points $x_0, x_1, \ldots, x_N \in [-1,+1]$ we plot the function

$$
\omega_N(x) = (x-x_0)(x-x_1) \ldots (x-x_N), \qquad x \in [-1,+1]
$$

for uniformly spaced points.
```python
from numpy import linspace
from matplotlib.pyplot import plot, title, grid

def omega(x, xp):
    # Evaluate omega(x) = prod_i (x - x_i) for the nodes in xp
    f = 1.0
    for z in xp:
        f = f * (x - z)
    return f

def plot_omega(x):
    M = 1000
    xx = linspace(-1.0, 1.0, M)
    f = 0*xx
    for i in range(M):
        f[i] = omega(xx[i], x)
    plot(xx, f, 'b-', x, 0*x, 'o')
    title("N = " + str(len(x)-1))  # degree N = number of nodes - 1
    grid(True)
```
```python
N = 3
x = linspace(-1.0, 1.0, N+1)
plot_omega(x)
```

```python
N = 4
x = linspace(-1.0, 1.0, N+1)
plot_omega(x)
```

```python
N = 6
x = linspace(-1.0, 1.0, N+1)
plot_omega(x)
```

```python
N = 20
x = linspace(-1.0, 1.0, N+1)
plot_omega(x)
```
## 5 Distribution of data points

The error in polynomial interpolation is

$$
|f(x) - p_N(x)| \le \frac{|\omega_N(x)|}{(N+1)!} \max_{\xi} |f^{(N+1)}(\xi)|
$$

where

$$
\omega_N(x) = (x-x_0)(x-x_1) \ldots (x-x_N)
$$

For a given function $f(x)$, we cannot do anything about the derivative term in the error estimate. For uniformly spaced data points, $\omega_N$ takes large values near the end points, which is also where the Runge phenomenon is observed. But we can try to minimize the magnitude of $\omega_N(x)$ by choosing a different set of nodes for interpolation. For the following discussion, let us assume that the $x_i$ are ordered and contained in the interval $[-1,+1]$.
Can we choose the nodes $\{x_i\}$ so that

$$
\max_{x \in [-1,+1]} |\omega_N(x)|
$$

is minimized? We will show that

$$
\min_{\{x_i\}} \max_{x \in [-1,+1]} |\omega_N(x)| = 2^{-N}
$$

and the minimum is achieved for the following set of nodes:

$$
x_i = \cos\left( \frac{2(N-i)+1}{2N+2} \pi \right), \qquad i = 0, 1, \ldots, N
$$

These are called *Chebyshev points of first kind*.
## 6 Chebyshev polynomials

The Chebyshev polynomials are defined on the interval $[-1,+1]$ and the first two are

$$
\begin{aligned}
T_0(x) &= 1 \\
T_1(x) &= x
\end{aligned}
$$

The remaining polynomials can be defined by the recursion

$$
T_{n+1}(x) = 2 x T_n(x) - T_{n-1}(x)
$$

so that

$$
\begin{aligned}
T_2(x) &= 2x^2 - 1 \\
T_3(x) &= 4x^3 - 3x \\
T_4(x) &= 8x^4 - 8x^2 + 1
\end{aligned}
$$

In Python, we can use function recursion to compute the Chebyshev polynomials.
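A minimal recursive implementation, directly following the recurrence above (the function name `chebyshev` is ours; the doubly recursive calls are exponential in `n`, which is fine for small `n`):

```python
def chebyshev(n, x):
    # T_0(x) = 1, T_1(x) = x, T_{n+1}(x) = 2 x T_n(x) - T_{n-1}(x)
    if n == 0:
        return 1.0 + 0.0*x   # returns an array of ones if x is an array
    elif n == 1:
        return x
    else:
        return 2.0*x*chebyshev(n-1, x) - chebyshev(n-2, x)
```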
We can write $x \in [-1,+1]$ in terms of an angle $\theta \in [0,\pi]$ via $x = \cos\theta$.

Properties of Chebyshev polynomials:

- $T_n(x)$ is a polynomial of degree $n$.
- $T_n(\cos\theta) = \cos(n\theta)$ (Fourier cosine series)
- $T_n(x) = \cos(n \cos^{-1}(x))$
- $|T_n(x)| \le 1$
- Extrema:
  $$
  T_n\left(\cos\frac{j\pi}{n}\right) = (-1)^j, \qquad 0 \le j \le n
  $$
- Roots:
  $$
  T_n\left( \cos \frac{2j+1}{2n}\pi \right) = 0, \qquad 0 \le j \le n-1
  $$

If $p$ is a monic polynomial of degree $n$, then

$$
\max_{x \in [-1,+1]} |p(x)| \ge 2^{1-n}
$$

We prove this by contradiction. Suppose that
$$
|p(x)| < 2^{1-n}, \qquad \forall\, |x| \le 1
$$

Define

$$
q(x) = 2^{1-n} T_n(x)
$$

The extrema of $T_n$ are at

$$
x_i = \cos\left(\frac{i\pi}{n}\right), \qquad 0 \le i \le n
$$

and

$$
T_n(x_i) = (-1)^i \qquad \textrm{so that} \qquad q(x_i) = 2^{1-n} (-1)^i
$$

Now

$$
(-1)^i p(x_i) \le |p(x_i)| < 2^{1-n} = (-1)^i q(x_i)
$$

so that

$$
(-1)^i [q(x_i) - p(x_i)] > 0, \qquad 0 \le i \le n
$$

Thus $q - p$ changes sign $n$ times, so it must have at least $n$ roots. But $p$ and $q$ are both monic of degree $n$, so their leading terms cancel and $q - p$ is of degree at most $n-1$; hence we get a contradiction.
## 7 Optimal nodes

Since $\omega_N(x)$ is a monic polynomial of degree $N+1$, we know from the previous theorem that

$$
\max_{-1 \le x \le +1} |\omega_N(x)| \ge 2^{-N}
$$

Can we choose the $x_i$ so that the minimum value of $2^{-N}$ is achieved? If the $x_i$ are the $N+1$ distinct roots of $T_{N+1}(x)$, then

$$
\omega_N(x) = 2^{-N} T_{N+1}(x)
$$

so that

$$
\max_{-1 \le x \le +1} |\omega_N(x)| = 2^{-N} \max_{-1 \le x \le +1} |T_{N+1}(x)| = 2^{-N}
$$

The optimal nodes are thus given by

$$
x_i = \cos\left( \frac{2i+1}{2N+2} \pi \right), \qquad 0 \le i \le N
$$

In terms of the $\theta$ variable, the spacing between these nodes is

$$
\theta_{i+1} - \theta_i = \frac{\pi}{N+1}
$$

More generally, the roots of the degree $n$ Chebyshev polynomial $T_n(x)$ are

$$
x_i = \cos\left( \frac{2i+1}{2n} \pi \right), \qquad i = 0, 1, \ldots, n-1
$$

The roots are shown below for $n = 10, 11, \ldots, 19$.
```python
from numpy import linspace, pi, cos
from matplotlib.pyplot import subplot, plot, xticks, yticks, ylabel

c = 1
for n in range(10, 20):
    j = linspace(0, n-1, n)
    theta = (2*j + 1)*pi/(2*n)
    x = cos(theta)
    y = 0*x
    subplot(10, 1, c)
    plot(x, y, '.')
    xticks([]); yticks([])
    ylabel(str(n))
    c += 1
```
Note that the roots are clustered at the end points.
If the nodes $\{x_i\}$ are the $N+1$ roots of the Chebyshev polynomial $T_{N+1}$, then the error formula for polynomial interpolation in the interval $[-1,+1]$ is

$$
|f(x) - p_N(x)| \le \frac{1}{2^N (N+1)!} \max_{|t| \le 1} |f^{(N+1)}(t)|
$$

Note that

$$
x_0 = \cos\left( \frac{1}{2N+2} \pi \right) < 1, \qquad x_N = \cos\left( \frac{2N+1}{2N+2} \pi \right) > -1
$$

so the endpoints of $[-1,+1]$ are not nodes. The nodes are ordered as $x_0 > x_1 > \ldots > x_N$; we can reorder them by defining the $x_i$ as in (63).
In practice, we don't use the Chebyshev nodes as derived above. The important point is how the points are clustered near the ends of the interval, and this type of clustering can be achieved by other node sets. If we want $N+1$ nodes, divide $[0,\pi]$ into $N$ equal intervals, so that

$$
\theta_i = \frac{(N-i)\pi}{N}, \qquad i = 0, 1, \ldots, N
$$

and then the nodes are given by

$$
x_i = \cos\theta_i
$$

These are called *Chebyshev points of second kind*.
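For example, with $N = 8$ the second-kind points can be computed as in the sketch below; note that this ordering runs from $-1$ to $+1$:

```python
from numpy import arange, pi, cos

N = 8
i = arange(0, N+1)
theta = (N - i)*pi/N   # theta_i = (N - i) pi / N
x = cos(theta)         # x_0 = -1, ..., x_N = +1
print(x)
```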
Chebyshev points are defined in the interval $[-1,+1]$. They are obtained as projections onto the $x$-axis of uniformly spaced points on the unit circle. For any $N \ge 2$ they are defined as follows:

$$
\theta_i = \frac{\pi i}{N-1}, \qquad x_i = \cos(\theta_i), \qquad i = 0, 1, \ldots, N-1
$$

```python
from numpy import linspace, pi, cos, sin
from matplotlib.pyplot import plot, axis, title

t = linspace(0, pi, 1000)
xx, yy = cos(t), sin(t)
plot(xx, yy)                       # upper half of the unit circle

N = 10
theta = linspace(0, pi, N)
plot(cos(theta), sin(theta), 'o')  # uniformly spaced points on the circle
for i in range(N):
    x1 = [cos(theta[i]), cos(theta[i])]
    y1 = [0.0, sin(theta[i])]
    plot(x1, y1, 'k--', cos(theta[i]), 0, 'sr')  # project onto the x-axis
axis([-1.1, 1.1, 0.0, 1.1])
axis('equal'), title(str(N) + ' Chebyshev points');
```
Below, we compare the polynomial $\omega_N(x)$ for uniform and Chebyshev points with $N = 16$.
```python
from numpy import linspace, pi, cos
from matplotlib.pyplot import plot, legend, grid, title

M = 1000
xx = linspace(-1.0, 1.0, M)
N = 16
xu = linspace(-1.0, 1.0, N+1)     # uniform points
xc = cos(linspace(0.0, pi, N+1))  # Chebyshev points
fu = 0*xx
fc = 0*xx
for i in range(M):
    fu[i] = omega(xx[i], xu)
    fc[i] = omega(xx[i], xc)
plot(xx, fu, 'b-', xx, fc, 'r-')
legend(("Uniform", "Chebyshev"))
grid(True), title("N = " + str(N));
```