Consider a partition of [ a , b ] [a,b] [ a , b ] into a grid
a = x 0 < x 1 < … < x N = b a = x_0 < x_1 < \ldots < x_N = b a = x 0 < x 1 < … < x N = b The points x j x_j x j are called knots, breakpoints or nodes. Let p ( x ) p(x) p ( x ) be a polynomial on each of the subintervals
[ x 0 , x 1 ] , [ x 1 , x 2 ] , … , [ x N − 1 , x N ] [x_0,x_1], \quad [x_1,x_2], \quad \ldots, [x_{N-1},x_N] [ x 0 , x 1 ] , [ x 1 , x 2 ] , … , [ x N − 1 , x N ] i.e.,
p ( x ) ∈ P r , x ∈ [ x i , x i + 1 ] p(x) \in \poly_r, \qquad x \in [x_i,x_{i+1}] p ( x ) ∈ P r , x ∈ [ x i , x i + 1 ] Note that p ( x ) p(x) p ( x ) need not be a polynomial in [ a , b ] [a, b] [ a , b ] . We say that p ( x ) p(x) p ( x ) is a piecewise polynomial of degree r r r if the degree of p ( x ) p(x) p ( x ) is ≤ r \le r ≤ r on each of the sub-intervals. In general, no restrictions on the continuity of p ( x ) p(x) p ( x ) or its derivatives at the end points of the sub-intervals might exist.
1 Piecewise linear interpolation ¶ Let us demand the we want a continuous approximation. We are given the function values y i = f ( x i ) y_i = f(x_i) y i = f ( x i ) at all the nodes of our grid. We can construct the polynomial of each sub-interval
p ( x ) = x − x i + 1 x i − x i + 1 y i + x − x i x i + 1 − x i y i + 1 , x i ≤ x ≤ x i + 1 p(x) = \frac{x - x_{i+1}}{x_i - x_{i+1}} y_i + \frac{x - x_i}{x_{i+1} - x_i} y_{i+1},
\qquad x_i \le x \le x_{i+1} p ( x ) = x i − x i + 1 x − x i + 1 y i + x i + 1 − x i x − x i y i + 1 , x i ≤ x ≤ x i + 1 Clearly, this is linear and continuous. On each interval we know the interpolation error estimate
max x ∈ [ x i , x i + 1 ] ∣ f ( x ) − p ( x ) ∣ ≤ h i + 1 2 2 max t ∈ [ x i , x i + 1 ] ∣ f ′ ′ ( t ) ∣ , h i + 1 = x i + 1 − x i \max_{x \in [x_i,x_{i+1}]} |f(x) - p(x)| \le \frac{h_{i+1}^2}{2} \max_{t \in
[x_i,x_{i+1}]} |f''(t)|, \qquad h_{i+1} = x_{i+1} - x_i x ∈ [ x i , x i + 1 ] max ∣ f ( x ) − p ( x ) ∣ ≤ 2 h i + 1 2 t ∈ [ x i , x i + 1 ] max ∣ f ′′ ( t ) ∣ , h i + 1 = x i + 1 − x i from which we get the global error
max x ∈ [ x 0 , x N ] ∣ f ( x ) − p ( x ) ∣ ≤ h 2 2 max t ∈ [ x 0 , x N ] ∣ f ′ ′ ( t ) ∣ , h = max i h i \max_{x \in [x_0,x_N]} |f(x) - p(x)| \le \frac{h^2}{2} \max_{t \in [x_0,x_N]} |f''(t)|,
\qquad h = \max_i h_i x ∈ [ x 0 , x N ] max ∣ f ( x ) − p ( x ) ∣ ≤ 2 h 2 t ∈ [ x 0 , x N ] max ∣ f ′′ ( t ) ∣ , h = i max h i Clearly, we have convergence as N → ∞ N \to \infty N → ∞ , h → 0 h \to 0 h → 0 . The convergence is quadratic; if h h h becomes h / 2 h/2 h /2 , the error reduces by a factor of 1 / 4 1/4 1/4 .
The polynomial degree is fixed and convergence is achieved because the spacing between the grid points becomes smaller. Note that we only require some condition on the second derivative of the function f f f while in the global interpolation problem, we required conditions on higher derivatives, which can be difficult to satisfy in some situations.
We have written down the polynomial in a piecewise manner but we can also write a global view of the polynomial by defining some basis functions
ϕ i ( x ) = { x − x i − 1 x i − x i − 1 x i − 1 ≤ x ≤ x i x i + 1 − x x i + 1 − x i x i ≤ x ≤ x i + 1 0 otherwise \phi_i(x) = \begin{cases}
\frac{x - x_{i-1}}{x_i - x_{i-1}} & x_{i-1} \le x \le x_i \\
\frac{x_{i+1} - x}{x_{i+1} - x_i} & x_i \le x \le x_{i+1} \\
0 & \textrm{otherwise}
\end{cases} ϕ i ( x ) = ⎩ ⎨ ⎧ x i − x i − 1 x − x i − 1 x i + 1 − x i x i + 1 − x 0 x i − 1 ≤ x ≤ x i x i ≤ x ≤ x i + 1 otherwise These are triangular hat functions with compact support. Then the piecewise linear approximation is given by
p ( x ) = ∑ i = 0 N y i ϕ i ( x ) p(x) = \sum_{i=0}^N y_i \phi_i(x) p ( x ) = i = 0 ∑ N y i ϕ i ( x ) 1.2 An adaptive algorithm ¶ We can use the local error estimate to improve the approximation. We have to first estimate the error in the an interval I i = [ x i − 1 , x i ] I_i = [x_{i-1},x_i] I i = [ x i − 1 , x i ] by some finite difference formula H i H_i H i . If we want the error to be ≤ ϵ \le \epsilon ≤ ϵ , then we must choose h i h_i h i so that
h i 2 8 H i ≤ ϵ \frac{h_i^2}{8} H_i \le \epsilon 8 h i 2 H i ≤ ϵ If in some interval, it happens that
h i 2 8 H i > ϵ \frac{h_i^2}{8} H_i > \epsilon 8 h i 2 H i > ϵ divide I i I_i I i into two intervals
[ x i − 1 1 2 ( x i − 1 + x i ) ] [ 1 2 ( x i − 1 + x i ) , x i ] \left[ x_{i-1} \half(x_{i-1}+x_i) \right] \qquad \left[ \half(x_{i-1}+x_i), x_i \right] [ x i − 1 2 1 ( x i − 1 + x i ) ] [ 2 1 ( x i − 1 + x i ) , x i ] We can start with a uniform grid of N N N intervals and apply the above algorithm. The code finds the interval with the maximum error and divides it into two intervals. The error in the interval [ x i − 1 , x i ] [x_{i-1},x_i] [ x i − 1 , x i ] is estimated by fitting a cubic polynomial to the data x i − 2 , x i − 1 , x i , x i + 1 x_{i-2},x_{i-1},x_i,x_{i+1} x i − 2 , x i − 1 , x i , x i + 1 using polyfit
and finding the maximum value of its second derivative at x i − 1 , x i x_{i-1},x_i x i − 1 , x i to estimate H i H_i H i . P = polyfit(x,f,3)
returns a polynomial in the form
P [ 0 ] ∗ x 3 + P [ 1 ] ∗ x 2 + P [ 2 ] ∗ x + P [ 3 ] P[0]*x^3 + P[1]*x^2 + P[2]*x + P[3] P [ 0 ] ∗ x 3 + P [ 1 ] ∗ x 2 + P [ 2 ] ∗ x + P [ 3 ] whose second derivative is
6 ∗ P [ 0 ] ∗ x + 2 ∗ P [ 1 ] 6*P[0]*x + 2*P[1] 6 ∗ P [ 0 ] ∗ x + 2 ∗ P [ 1 ] Modify the program by changing the error estimate as follows. Estimate second derivative at x i − 1 x_{i-1} x i − 1 by interpolating a quadratic to the data x i − 2 , x i − 1 , x i x_{i-2},x_{i-1},x_i x i − 2 , x i − 1 , x i and taking its second derivative. Similarly, at x i x_i x i , find a quadratic polynomial to the data x i − 1 , x i , x i + 1 x_{i-1},x_i,x_{i+1} x i − 1 , x i , x i + 1 and find its second derivative. Then estimate the maximum value of second derivative in the interval as the maximum of the derivatives at the end
points.
Let us try to approximate
f ( x ) = exp ( − 100 ( x − 1 / 2 ) 2 ) sin ( 4 π x ) , x ∈ [ 0 , 1 ] f(x) = \exp(-100(x-1/2)^2) \sin(4 \pi x), \qquad x \in [0,1] f ( x ) = exp ( − 100 ( x − 1/2 ) 2 ) sin ( 4 π x ) , x ∈ [ 0 , 1 ] with piecewise linear approximation.
xmin, xmax = 0.0, 1.0
fun = lambda x: exp(-100*(x-0.5)**2) * sin(4*pi*x)
Here is the initial approximation.
N = 10 # number of initial points
x = linspace(xmin,xmax,N)
f = fun(x)
ne = 100
xe = linspace(xmin,xmax,ne)
fe = fun(xe)
fig, ax = subplots()
line1, = ax.plot(xe,fe,'-',linewidth=2)
line2, = ax.plot(x,f,'or--',linewidth=2)
ax.set_title('Initial approximation, N = ' + str(N));
The next function performs uniform and adaptive refinement.
def adapt(x,f,nadapt=100,err=1.0e-2,mode=1):
N = len(x)
for n in range(nadapt):
h = x[1:] - x[0:-1]
H = zeros(N-1)
for j in range(1,N-2): # skip first and last element
P = polyfit(x[j-1:j+3], f[j-1:j+3], 3)
H[j] = max(abs(6*P[0]*x[j:j+2] + 2*P[1]))
elem_err = (1.0/8.0) * h**2 * abs(H)
i = argmax(elem_err); current_err = elem_err[i]
if current_err < err:
print('Satisfied error tolerance, N = ' + str(N))
return x,f
if mode == 0:
x = linspace(xmin,xmax,2*N)
f = fun(x)
else:
x = concatenate([x[0:i+1], [0.5*(x[i]+x[i+1])], x[i+1:]])
f = concatenate([f[0:i+1], [fun(x[i+1])], f[i+1:]])
N = len(x)
return x,f
Let us first try uniform refinement
Satisfied error tolerance, N = 80
and adaptive refinement.
Satisfied error tolerance, N = 31
Here is an animation of adaptive refinement.
def fplot(frame_number,x,f):
x1, f1 = adapt(x,f,frame_number)
line2.set_data(x1,f1)
ax.set_title('N = '+str(len(x1)))
return line2,
anim = animation.FuncAnimation(fig, fplot, frames=22, fargs=[x,f], repeat=False)
HTML(anim.to_jshtml())
The adaptive algorithm puts new points in regions of large gradient, where the resolution is not sufficient, and does not add anything in other regions.
2 Piecewise quadratic interpolation ¶ Consider a grid
x 0 < x 1 < … < x N x_0 < x_1 < \ldots < x_N x 0 < x 1 < … < x N and define the mid-points of the intervals
x i − 1 2 = 1 2 ( x i − 1 + x i ) , i = 1 , 2 , … , N x_\imh = \half(x_{i-1} + x_i), \qquad i=1,2,\ldots,N x i − 2 1 = 2 1 ( x i − 1 + x i ) , i = 1 , 2 , … , N We are given the information
y i = f ( x i ) , i = 0 , 1 , … , N y_i = f(x_i), \qquad i=0,1,\ldots,N y i = f ( x i ) , i = 0 , 1 , … , N y i − 1 2 = f ( x i − 1 2 ) , i = 1 , 2 , … , N y_\imh = f(x_\imh), \qquad i=1,2,\ldots,N y i − 2 1 = f ( x i − 2 1 ) , i = 1 , 2 , … , N In the interval I i = [ x i − 1 , x i ] I_i = [x_{i-1},x_i] I i = [ x i − 1 , x i ] we know three values
y i − 1 , y i − 1 2 , y i y_{i-1}, y_\imh, y_i y i − 1 , y i − 2 1 , y i and we can determine a quadratic polynomial the interpolates these values
p ( x ) = y i − 1 ϕ 0 ( x − x i − 1 h i ) + y i − 1 2 ϕ 1 ( x − x i − 1 h i ) + y i ϕ 2 ( x − x i − 1 h i ) p(x) = y_{i-1} \phi_0\left(\frac{x-x_{i-1}}{h_i}\right) + y_\imh \phi_1\left(\frac{x-
x_{i-1}}{h_i}\right) + y_i \phi_2\left(\frac{x-x_{i-1}}{h_i}\right) p ( x ) = y i − 1 ϕ 0 ( h i x − x i − 1 ) + y i − 2 1 ϕ 1 ( h i x − x i − 1 ) + y i ϕ 2 ( h i x − x i − 1 ) where ϕ 0 , ϕ 1 , ϕ 2 \phi_0,\phi_1,\phi_2 ϕ 0 , ϕ 1 , ϕ 2 are Lagrange polynomials given by
ϕ 0 ( ξ ) = 2 ( ξ − 1 2 ) ( ξ − 1 ) , ϕ 1 ( ξ ) = 4 ξ ( ξ − 1 ) , ϕ 2 ( ξ ) = 2 ξ ( ξ − 1 2 ) \phi_0(\xi) = 2(\xi - \shalf)(\xi - 1), \qquad \phi_1(\xi) = 4\xi(\xi-1), \qquad \phi_2(\xi)
= 2\xi(\xi-\shalf) ϕ 0 ( ξ ) = 2 ( ξ − 2 1 ) ( ξ − 1 ) , ϕ 1 ( ξ ) = 4 ξ ( ξ − 1 ) , ϕ 2 ( ξ ) = 2 ξ ( ξ − 2 1 ) Note that the quadratic interpolation is continuous but may not be differentiable.
3 Piecewise degree k k k interpolation ¶ Consider a grid
a = x 0 < x 1 < … < x N = b a = x_0 < x_1 < \ldots < x_N = b a = x 0 < x 1 < … < x N = b In each interval [ x i − 1 , x i ] [x_{i-1},x_i] [ x i − 1 , x i ] we want to construct a degree k k k polynomial approximation, so we need to know k + 1 k+1 k + 1 function values at k + 1 k+1 k + 1 distinct points
x i − 1 = x i , 0 < x i , 1 < … < x i , k = x i x_{i-1} = x_{i,0} < x_{i,1} < \ldots < x_{i,k} = x_i x i − 1 = x i , 0 < x i , 1 < … < x i , k = x i Then the degree k k k polynomial is given by
p ( x ) = ∑ j = 0 k f ( x i , j ) ϕ j ( x − x i − 1 h i ) p(x) = \sum_{j=0}^k f(x_{i,j}) \phi_j\left( \frac{x-x_{i-1}}{h_i} \right) p ( x ) = j = 0 ∑ k f ( x i , j ) ϕ j ( h i x − x i − 1 ) where ϕ j : [ 0 , 1 ] → R \phi_j : [0,1] \to \re ϕ j : [ 0 , 1 ] → R are Lagrange polynomials with the property
ϕ j ( ξ l ) = δ j l , 0 ≤ j , l ≤ k , where ξ l = x i , l − x i − 1 h i \phi_j(\xi_l) = \delta_{jl}, \quad 0 \le j,l \le k, \qquad \textrm{where} \quad \xi_l =
\frac{x_{i,l} - x_{i-1}}{h_i} ϕ j ( ξ l ) = δ j l , 0 ≤ j , l ≤ k , where ξ l = h i x i , l − x i − 1 Since we chose the internal nodes such that x i , 0 = x i − 1 x_{i,0} = x_{i-1} x i , 0 = x i − 1 and x i , k = x i x_{i,k} = x_i x i , k = x i , the piecewise polynomial is continuous, but may not be differentiable.
4 Error estimate in Sobolev spaces ¶ The estimate we have derived required functions to be differentiable. Here we use weaker smoothness properties.
Let f ∈ H 2 ( 0 , 1 ) f \in H^2(0,1) f ∈ H 2 ( 0 , 1 ) and let p ( x ) p(x) p ( x ) be the piecewise linear interpolant of f f f . Then
∥ f − p ∥ L 2 ( 0 , 1 ) ≤ C h 2 ∥ f ′ ′ ∥ L 2 ( 0 , 1 ) \norm{f - p}_{L^2(0,1)} \le C h^2 \norm{f''}_{L^2(0,1)} ∥ f − p ∥ L 2 ( 0 , 1 ) ≤ C h 2 ∥ f ′′ ∥ L 2 ( 0 , 1 ) ∥ f ′ − p ′ ∥ L 2 ( 0 , 1 ) ≤ C h ∥ f ′ ′ ∥ L 2 ( 0 , 1 ) \norm{f' - p'}_{L^2(0,1)} \le C h \norm{f''}_{L^2(0,1)} ∥ f ′ − p ′ ∥ L 2 ( 0 , 1 ) ≤ C h ∥ f ′′ ∥ L 2 ( 0 , 1 ) Since C ∞ ( 0 , 1 ) \cts^\infty(0,1) C ∞ ( 0 , 1 ) is dense in H 2 ( 0 , 1 ) H^2(0,1) H 2 ( 0 , 1 ) , it is enough to prove the error estimate for functions f ∈ C ∞ ( 0 , 1 ) f \in \cts^\infty(0,1) f ∈ C ∞ ( 0 , 1 ) .
(1) If x ∈ [ x j , x j + 1 ] x \in [x_j, x_{j+1}] x ∈ [ x j , x j + 1 ] then
f ( x ) − p ( x ) = f ( x ) − [ f ( x j ) + f ( x j + 1 ) − f ( x j ) x j + 1 − x j ( x − x j ) ] = ∫ x j x f ′ ( t ) d t − x − x j x j + 1 − x j ∫ x j x j + 1 f ′ ( t ) d t = ( x − x j ) f ′ ( x j + θ x ) − ( x − x j ) f ′ ( x j + θ j ) , 0 ≤ θ x ≤ x − x j 0 ≤ θ j ≤ h j = ( x − x j ) ∫ x j + θ j x j + θ x f ′ ′ ( t ) d t \begin{aligned}
f(x) - p(x)
&= f(x) - \left[ f(x_j) + \frac{f(x_{j+1}) - f(x_j)}{x_{j+1} - x_j} (x - x_j)
\right] \\
&= \int_{x_j}^x f'(t) \ud t - \frac{x - x_j}{x_{j+1} - x_j} \int_{x_j}^{x_{j+1}} f'(t)
\ud t \\
&= (x-x_j) f'(x_j + \theta_x) - (x-x_j) f'(x_j + \theta_j), \quad {0 \le \theta_x \le x - x_j
\atop 0 \le \theta_j \le h_j} \\
&= (x-x_j) \int_{x_j + \theta_j}^{x_j + \theta_x} f''(t) \ud t
\end{aligned} f ( x ) − p ( x ) = f ( x ) − [ f ( x j ) + x j + 1 − x j f ( x j + 1 ) − f ( x j ) ( x − x j ) ] = ∫ x j x f ′ ( t ) d t − x j + 1 − x j x − x j ∫ x j x j + 1 f ′ ( t ) d t = ( x − x j ) f ′ ( x j + θ x ) − ( x − x j ) f ′ ( x j + θ j ) , 0 ≤ θ j ≤ h j 0 ≤ θ x ≤ x − x j = ( x − x j ) ∫ x j + θ j x j + θ x f ′′ ( t ) d t Squaring both sides and using ∣ x − x j ∣ ≤ h j |x-x_j| \le h_j ∣ x − x j ∣ ≤ h j
∣ f ( x ) − p ( x ) ∣ 2 ≤ h j 2 [ ∫ x j + θ j x j + θ x f ′ ′ ( t ) d t ] 2 ≤ h j 2 [ ∫ x j x j + 1 ∣ f ′ ′ ( t ) ∣ d t ] 2 ≤ h j 3 ∫ x j x j + 1 ∣ f ′ ′ ( t ) ∣ 2 d t by Cauchy-Schwarz inequality \begin{aligned}
|f(x) - p(x)|^2
&\le h_j^2 \left[ \int_{x_j + \theta_j}^{x_j + \theta_x} f''(t) \ud t
\right]^2 \\
&\le h_j^2 \left[ \int_{x_j}^{x_{j+1}} |f''(t)| \ud t \right]^2 \\
&\le h_j^3 \int_{x_j}^{x_{j+1}} |f''(t)|^2 \ud t \quad \textrm{by Cauchy-Schwarz
inequality}
\end{aligned} ∣ f ( x ) − p ( x ) ∣ 2 ≤ h j 2 [ ∫ x j + θ j x j + θ x f ′′ ( t ) d t ] 2 ≤ h j 2 [ ∫ x j x j + 1 ∣ f ′′ ( t ) ∣ d t ] 2 ≤ h j 3 ∫ x j x j + 1 ∣ f ′′ ( t ) ∣ 2 d t by Cauchy-Schwarz inequality Integrating on [ x j , x j + 1 ] [x_j,x_{j+1}] [ x j , x j + 1 ]
∫ x j x j + 1 ∣ f ( x ) − p ( x ) ∣ 2 d x ≤ h j 4 ∫ x j x j + 1 ∣ f ′ ′ ( t ) ∣ 2 d t \int_{x_j}^{x_{j+1}} |f(x) - p(x)|^2 \ud x \le h_j^4 \int_{x_j}^{x_{j+1}} |f''(t)|^2
\ud t ∫ x j x j + 1 ∣ f ( x ) − p ( x ) ∣ 2 d x ≤ h j 4 ∫ x j x j + 1 ∣ f ′′ ( t ) ∣ 2 d t Adding such an inequality from all the intervals
∥ f − p ∥ L 2 ( 0 , 1 ) 2 ≤ ∑ j h j 4 ∫ x j x j + 1 ∣ f ′ ′ ( t ) ∣ 2 d t ≤ h 4 ∑ j ∫ x j x j + 1 ∣ f ′ ′ ( t ) ∣ 2 d t = h 4 ∥ f ′ ′ ∥ L 2 ( 0 , 1 ) 2 \begin{aligned}
\norm{f-p}_{L^2(0,1)}^2
&\le \sum_j h_j^4 \int_{x_j}^{x_{j+1}} |f''(t)|^2 \ud t \\
&\le h^4 \sum_j \int_{x_j}^{x_{j+1}} |f''(t)|^2 \ud t \\
&= h^4 \norm{f''}_{L^2(0,1)}^2
\end{aligned} ∥ f − p ∥ L 2 ( 0 , 1 ) 2 ≤ j ∑ h j 4 ∫ x j x j + 1 ∣ f ′′ ( t ) ∣ 2 d t ≤ h 4 j ∑ ∫ x j x j + 1 ∣ f ′′ ( t ) ∣ 2 d t = h 4 ∥ f ′′ ∥ L 2 ( 0 , 1 ) 2 we get the first error estimate.
(2) The error in derivative is
f ′ ( x ) − p ′ ( x ) = f ′ ( x ) − f ( x j + 1 ) − f ( x j ) h j = 1 h j ∫ x j x j + 1 [ f ′ ( x ) − f ′ ( t ) ] d t = 1 h j ∫ x j x j + 1 ∫ t x f ′ ′ ( y ) d y d t \begin{aligned}
f'(x) - p'(x)
&= f'(x) - \frac{f(x_{j+1}) - f(x_j)}{h_j} \\
&= \frac{1}{h_j} \int_{x_j}^{x_{j+1}}[f'(x) - f'(t)]\ud t \\
&= \frac{1}{h_j} \int_{x_j}^{x_{j+1}} \int_t^x f''(y) \ud y \ud t
\end{aligned} f ′ ( x ) − p ′ ( x ) = f ′ ( x ) − h j f ( x j + 1 ) − f ( x j ) = h j 1 ∫ x j x j + 1 [ f ′ ( x ) − f ′ ( t )] d t = h j 1 ∫ x j x j + 1 ∫ t x f ′′ ( y ) d y d t Squaring both sides
∣ f ′ ( x ) − p ′ ( x ) ∣ 2 = 1 h j 2 ( ∫ x j x j + 1 ∫ t x f ′ ′ ( y ) d y d t ) 2 ≤ 1 h j 2 ( ∫ x j x j + 1 ∫ t x ∣ f ′ ′ ( y ) ∣ d y d t ) 2 ≤ 1 h j 2 ( ∫ x j x j + 1 ∫ x j x j + 1 ∣ f ′ ′ ( y ) ∣ d y d t ) 2 = ( ∫ x j x j + 1 ∣ f ′ ′ ( y ) ∣ d y ) 2 ≤ h j ∫ x j x j + 1 ∣ f ′ ′ ( y ) ∣ 2 d y by Cauchy-Schwarz inequality \begin{aligned}
|f'(x) - p'(x)|^2
&= \frac{1}{h_j^2} \left( \int_{x_j}^{x_{j+1}} \int_t^x f''(y) \ud y
\ud t \right)^2 \\
&\le \frac{1}{h_j^2} \left( \int_{x_j}^{x_{j+1}} \int_t^x |f''(y)| \ud y \ud t \right)^2
\\
&\le \frac{1}{h_j^2} \left( \int_{x_j}^{x_{j+1}} \int_{x_j}^{x_{j+1}} |f''(y)| \ud y
\ud t \right)^2 \\
&= \left( \int_{x_j}^{x_{j+1}} |f''(y)| \ud y \right)^2 \\
&\le h_j \int_{x_j}^{x_{j+1}} |f''(y)|^2 \ud y \quad \textrm{by Cauchy-Schwarz
inequality}
\end{aligned} ∣ f ′ ( x ) − p ′ ( x ) ∣ 2 = h j 2 1 ( ∫ x j x j + 1 ∫ t x f ′′ ( y ) d y d t ) 2 ≤ h j 2 1 ( ∫ x j x j + 1 ∫ t x ∣ f ′′ ( y ) ∣ d y d t ) 2 ≤ h j 2 1 ( ∫ x j x j + 1 ∫ x j x j + 1 ∣ f ′′ ( y ) ∣ d y d t ) 2 = ( ∫ x j x j + 1 ∣ f ′′ ( y ) ∣ d y ) 2 ≤ h j ∫ x j x j + 1 ∣ f ′′ ( y ) ∣ 2 d y by Cauchy-Schwarz inequality Integrating with respect to x x x
∫ x j x j + 1 ∣ f ′ ( x ) − p ′ ( x ) ∣ 2 d x ≤ h j 2 ∫ x j x j + 1 ∣ f ′ ′ ( y ) ∣ 2 d y \int_{x_j}^{x_{j+1}} |f'(x) - p'(x)|^2 \ud x \le h_j^2 \int_{x_j}^{x_{j+1}} |f''(y)|^2
\ud y ∫ x j x j + 1 ∣ f ′ ( x ) − p ′ ( x ) ∣ 2 d x ≤ h j 2 ∫ x j x j + 1 ∣ f ′′ ( y ) ∣ 2 d y Adding such inequalities from each interval, we obtain the second error estimate.