#%config InlineBackend.figure_format = 'svg'
from pylab import *
3.7.1 Newton as fixed point iteration ¶ The Newton-Raphson method to find the zeros of f ( x ) f(x) f ( x ) can be written as
x n + 1 = ϕ ( x n ) , ϕ ( x ) = x − f ( x ) f ′ ( x ) x_{n+1} = \phi(x_n), \qquad \phi(x) = x - \frac{f(x)}{f'(x)} x n + 1 = ϕ ( x n ) , ϕ ( x ) = x − f ′ ( x ) f ( x ) At a root α, we have
f ( α ) = 0 , ϕ ( α ) = α − f ( α ) f ′ ( α ) = α f(\alpha) = 0, \qquad \phi(\alpha) = \alpha - \frac{f(\alpha)}{f'(\alpha)} = \alpha f ( α ) = 0 , ϕ ( α ) = α − f ′ ( α ) f ( α ) = α Thus the root is a fixed point of the function ϕ. We have converted
the problem of root finding into one of finding the fixed points of a
map via the fixed point iteration x n + 1 = ϕ ( x n ) x_{n+1} = \phi(x_n) x n + 1 = ϕ ( x n ) .
To find a \sqrt{a} a for a > 0 a > 0 a > 0 , we can find the roots of f ( x ) = x 2 − a f(x) = x^2 - a f ( x ) = x 2 − a . We can rewrite this as a fixed point relation in many ways, e.g.,
x = x + c ( x 2 − a ) x = x + c(x^2 - a) x = x + c ( x 2 − a ) for some c ≠ 0 c \ne 0 c = 0
x = a x x = \frac{a}{x} x = x a
x = 1 2 ( x + a x ) x = \half(x + \frac{a}{x}) x = 2 1 ( x + x a ) , which is Newton method.
Not all these fixed point iterations may converge.
a = 3.0
c = 0.1
x1, x2, x3 = 2.0, 2.0, 2.0
print("%6d %18.10e %18.10e %18.10e" % (0,x1,x2,x3))
for i in range(10):
x1 = x1 + c*(x1**2 - a)
x2 = a/x2
x3 = 0.5*(x3 + a/x3)
print("%6d %18.10e %18.10e %18.10e" % (i+1,x1,x2,x3))
0 2.0000000000e+00 2.0000000000e+00 2.0000000000e+00
1 2.1000000000e+00 1.5000000000e+00 1.7500000000e+00
2 2.2410000000e+00 2.0000000000e+00 1.7321428571e+00
3 2.4432081000e+00 1.5000000000e+00 1.7320508100e+00
4 2.7401346820e+00 2.0000000000e+00 1.7320508076e+00
5 3.1909684895e+00 1.5000000000e+00 1.7320508076e+00
6 3.9091964797e+00 2.0000000000e+00 1.7320508076e+00
7 5.1373781913e+00 1.5000000000e+00 1.7320508076e+00
8 7.4766436594e+00 2.0000000000e+00 1.7320508076e+00
9 1.2766663700e+01 1.5000000000e+00 1.7320508076e+00
10 2.8765433904e+01 2.0000000000e+00 1.7320508076e+00
Only the third form, the Newton-Method, is converging to the root.
3.7.2 Fixed point, contraction map ¶ Let ϕ : [ a , b ] → [ a , b ] \phi : [a,b] \to [a,b] ϕ : [ a , b ] → [ a , b ] be continuous. Then x = ϕ ( x ) x = \phi(x) x = ϕ ( x ) has
atleast one solution in [ a , b ] [a,b] [ a , b ] .
Consider the continuous function
h ( x ) = ϕ ( x ) − x h(x) = \phi(x) - x h ( x ) = ϕ ( x ) − x for which
h ( a ) = ϕ ( a ) − a ≥ 0 , h ( b ) = ϕ ( b ) − b ≤ 0 h(a) = \phi(a) - a \ge 0, \qquad h(b) = \phi(b) - b \le 0 h ( a ) = ϕ ( a ) − a ≥ 0 , h ( b ) = ϕ ( b ) − b ≤ 0 By intermediate value theorem, h ( x ) h(x) h ( x ) must have atleast one root in
[ a , b ] [a,b] [ a , b ] , which is also the fixed point of ϕ ( x ) \phi(x) ϕ ( x ) .
Theorem 2 (Contraction map)
Let ϕ : [ a , b ] → [ a , b ] \phi : [a,b] \to [a,b] ϕ : [ a , b ] → [ a , b ] be continuous and for some λ ∈ ( 0 , 1 ) \lambda \in (0,1) λ ∈ ( 0 , 1 )
∣ ϕ ( x ) − ϕ ( y ) ∣ ≤ λ ∣ x − y ∣ , ∀ x , y ∈ [ a , b ] |\phi(x) - \phi(y)| \le \lambda |x-y|, \quad \forall x, y \in [a,b] ∣ ϕ ( x ) − ϕ ( y ) ∣ ≤ λ ∣ x − y ∣ , ∀ x , y ∈ [ a , b ] Then x = ϕ ( x ) x=\phi(x) x = ϕ ( x ) has a unique solution α ∈ [ a , b ] \alpha \in [a,b] α ∈ [ a , b ] . Also, the iterates
x n = ϕ ( x n − 1 ) , n ≥ 1 x_n = \phi(x_{n-1}), \qquad n \ge 1 x n = ϕ ( x n − 1 ) , n ≥ 1 will converge to α for any choice of x 0 ∈ [ a , b ] x_0 \in [a,b] x 0 ∈ [ a , b ] and
∣ α − x n ∣ ≤ λ n 1 − λ ∣ x 1 − x 0 ∣ |\alpha - x_n| \le \frac{\lambda^n}{1-\lambda} |x_1 - x_0| ∣ α − x n ∣ ≤ 1 − λ λ n ∣ x 1 − x 0 ∣ Uniqueness. Suppose ϕ has two fixed points α , β ∈ [ a , b ] \alpha, \beta \in [a,b] α , β ∈ [ a , b ] . Then
∣ α − β ∣ = ∣ ϕ ( α ) − ϕ ( β ) ∣ ≤ λ ∣ α − β ∣ |\alpha - \beta| = |\phi(\alpha) - \phi(\beta)| \le \lambda |\alpha - \beta| ∣ α − β ∣ = ∣ ϕ ( α ) − ϕ ( β ) ∣ ≤ λ ∣ α − β ∣ and hence
( 1 − λ ) ∣ α − β ∣ ≤ 0 (1-\lambda) |\alpha - \beta| \le 0 ( 1 − λ ) ∣ α − β ∣ ≤ 0 Since 0 < λ < 1 0 < \lambda < 1 0 < λ < 1 , we conclude that only equality is possible, which
implies that α = β \alpha =\beta α = β .
Convergence. Since ϕ ( [ a , b ] ) ⊂ [ a , b ] \phi([a,b]) \subset [a,b] ϕ ([ a , b ]) ⊂ [ a , b ] , the
sequence generated by x n = ϕ ( x n − 1 ) x_n = \phi(x_{n-1}) x n = ϕ ( x n − 1 ) is contained in [ a , b ] [a,b] [ a , b ] .
∣ α − x n + 1 ∣ = ∣ ϕ ( α ) − ϕ ( x n ) ∣ ≤ λ ∣ α − x n ∣ |\alpha - x_{n+1}| = |\phi(\alpha) - \phi(x_n)| \le \lambda |\alpha - x_n| ∣ α − x n + 1 ∣ = ∣ ϕ ( α ) − ϕ ( x n ) ∣ ≤ λ ∣ α − x n ∣ and by induction
∣ α − x n ∣ ≤ λ n ∣ α − x 0 ∣ |\alpha - x_n| \le \lambda^n |\alpha - x_0| ∣ α − x n ∣ ≤ λ n ∣ α − x 0 ∣ As n → ∞ n \to \infty n → ∞ , λ n → 0 \lambda^n \to 0 λ n → 0 so that x n → α x_n \to \alpha x n → α .
Error bound. Using contraction property and triangle inequality, we have
∣ α − x 0 ∣ ≤ ∣ α − x 1 ∣ + ∣ x 1 − x 0 ∣ ≤ λ ∣ α − x 0 ∣ + ∣ x 1 − x 0 ∣ |\alpha - x_0| \le |\alpha - x_1| + |x_1 - x_0| \le \lambda |\alpha - x_0| + |x_1 - x_0| ∣ α − x 0 ∣ ≤ ∣ α − x 1 ∣ + ∣ x 1 − x 0 ∣ ≤ λ ∣ α − x 0 ∣ + ∣ x 1 − x 0 ∣ and hence
∣ α − x 0 ∣ ≤ 1 1 − λ ∣ x 1 − x 0 ∣ |\alpha - x_0| \le \frac{1}{1-\lambda} |x_1 - x_0| ∣ α − x 0 ∣ ≤ 1 − λ 1 ∣ x 1 − x 0 ∣ Combining this with (11) , we get
∣ α − x n ∣ ≤ λ n ∣ α − x 0 ∣ ≤ λ n 1 − λ ∣ x 1 − x 0 ∣ |\alpha - x_n| \le \lambda^n |\alpha - x_0| \le \frac{\lambda^n}{1-\lambda} |x_1 - x_0| ∣ α − x n ∣ ≤ λ n ∣ α − x 0 ∣ ≤ 1 − λ λ n ∣ x 1 − x 0 ∣ We can derive an error estimate as follows. Similar to (13) , we can show that
∣ α − x n ∣ ≤ 1 1 − λ ∣ x n + 1 − x n ∣ |\alpha - x_n| \le \frac{1}{1-\lambda} |x_{n+1} - x_n| ∣ α − x n ∣ ≤ 1 − λ 1 ∣ x n + 1 − x n ∣ and hence
∣ α − x n + 1 ∣ ≤ λ ∣ α − x n ∣ ≤ λ 1 − λ ∣ x n + 1 − x n ∣ |\alpha - x_{n+1}| \le \lambda |\alpha - x_n| \le \frac{\lambda}{1-\lambda} |x_{n+1} - x_n| ∣ α − x n + 1 ∣ ≤ λ ∣ α − x n ∣ ≤ 1 − λ λ ∣ x n + 1 − x n ∣ which is a more sharper error bound since λ < 1 \lambda < 1 λ < 1 . If
λ 1 − λ ∣ x n + 1 − x n ∣ ≤ ϵ ⟹ ∣ α − x n + 1 ∣ ≤ ϵ \frac{\lambda}{1-\lambda} |x_{n+1} - x_n| \le \epsilon \limplies
|\alpha - x_{n+1}| \le \epsilon 1 − λ λ ∣ x n + 1 − x n ∣ ≤ ϵ ⟹ ∣ α − x n + 1 ∣ ≤ ϵ 3.7.3 Differentiability and contractivity ¶ Assume that ϕ : [ a , b ] → [ a , b ] \phi : [a,b] \to [a,b] ϕ : [ a , b ] → [ a , b ] is continuously differentiable and
λ = max a ≤ x ≤ b ∣ ϕ ′ ( x ) ∣ < 1 \lambda = \max_{a \le x \le b} |\phi'(x)| < 1 λ = a ≤ x ≤ b max ∣ ϕ ′ ( x ) ∣ < 1 Then
ϕ has a unique fixed point α ∈ [ a , b ] \alpha \in [a,b] α ∈ [ a , b ] .
For any x 0 ∈ [ a , b ] x_0 \in [a,b] x 0 ∈ [ a , b ] , the iterations x n + 1 = ϕ ( x n ) x_{n+1} = \phi(x_{n}) x n + 1 = ϕ ( x n )
converge to α.
∣ α − x n ∣ ≤ λ n ∣ α − x 0 ∣ ≤ λ n 1 − λ ∣ x 1 − x 0 ∣ |\alpha - x_n| \le \lambda^n |\alpha - x_0| \le \frac{\lambda^n}{1-\lambda} |
x_1 - x_0| ∣ α − x n ∣ ≤ λ n ∣ α − x 0 ∣ ≤ 1 − λ λ n ∣ x 1 − x 0 ∣ and
lim n → ∞ α − x n + 1 α − x n = ϕ ′ ( α ) \lim_{n \to \infty} \frac{\alpha - x_{n+1}}{\alpha - x_n} = \phi'(\alpha) n → ∞ lim α − x n α − x n + 1 = ϕ ′ ( α ) (1,2) For some ξ between x x x , y y y , we have
ϕ ( x ) − ϕ ( y ) = ϕ ′ ( ξ ) ( x − y ) \phi(x) - \phi(y) = \phi'(\xi) (x-y) ϕ ( x ) − ϕ ( y ) = ϕ ′ ( ξ ) ( x − y ) and hence
∣ ϕ ( x ) − ϕ ( y ) ∣ = ∣ ϕ ′ ( ξ ) ∣ ∣ x − y ∣ ≤ λ ∣ x − y ∣ |\phi(x) - \phi(y)| = |\phi'(\xi)| |x-y| \le \lambda |x-y| ∣ ϕ ( x ) − ϕ ( y ) ∣ = ∣ ϕ ′ ( ξ ) ∣∣ x − y ∣ ≤ λ ∣ x − y ∣ so that ϕ is a contraction map and by previous theorem, (1) and (2) follow.
(3) Now
α − x n + 1 = ϕ ( α ) − ϕ ( x n ) = ϕ ′ ( ξ n ) ( α − x n ) , ξ n between α , x n \alpha - x_{n+1} = \phi(\alpha) - \phi(x_n) = \phi'(\xi_n) (\alpha - x_n), \qquad \xi_n \textrm{ between } \alpha, x_n α − x n + 1 = ϕ ( α ) − ϕ ( x n ) = ϕ ′ ( ξ n ) ( α − x n ) , ξ n between α , x n and ξ n → α \xi_n \to \alpha ξ n → α so that
lim n → ∞ α − x n + 1 α − x n = lim n → ∞ ϕ ′ ( ξ n ) = ϕ ′ ( α ) \lim_{n \to \infty} \frac{\alpha - x_{n+1}}{\alpha - x_n} = \lim_{n \to \infty}
\phi'(\xi_n) = \phi'(\alpha) n → ∞ lim α − x n α − x n + 1 = n → ∞ lim ϕ ′ ( ξ n ) = ϕ ′ ( α ) If ϕ ′ ( α ) ≠ 0 \phi'(\alpha) \ne 0 ϕ ′ ( α ) = 0 , then the sequence { x n } \{ x_n \} { x n } converges to α with order p = 1 p=1 p = 1 , i.e., we have linear convergence.
Pick a number λ satisfying
∣ ϕ ′ ( α ) ∣ < λ < 1 |\phi'(\alpha)| < \lambda < 1 ∣ ϕ ′ ( α ) ∣ < λ < 1 Then pick an interval I = [ α − δ , α + δ ] I = [\alpha - \delta, \alpha + \delta] I = [ α − δ , α + δ ] such that
max x ∈ I ∣ ϕ ′ ( x ) ∣ ≤ λ < 1 \max_{x \in I} |\phi'(x)| \le \lambda < 1 x ∈ I max ∣ ϕ ′ ( x ) ∣ ≤ λ < 1 Now for any x ∈ I x \in I x ∈ I , ∣ α − x ∣ ≤ δ |\alpha - x| \le \delta ∣ α − x ∣ ≤ δ and
∣ α − ϕ ( x ) ∣ = ∣ ϕ ( α ) − ϕ ( x ) ∣ = ∣ ϕ ′ ( ξ ) ∣ ⋅ ∣ α − x ∣ |\alpha - \phi(x)| = |\phi(\alpha) - \phi(x)| = |\phi'(\xi)| \cdot |\alpha - x| ∣ α − ϕ ( x ) ∣ = ∣ ϕ ( α ) − ϕ ( x ) ∣ = ∣ ϕ ′ ( ξ ) ∣ ⋅ ∣ α − x ∣ where ξ is between α , x \alpha, x α , x and hence in I I I , so that
∣ α − ϕ ( x ) ∣ ≤ λ ∣ α − x ∣ < ∣ α − x ∣ ≤ δ |\alpha - \phi(x)| \le \lambda |\alpha - x| < |\alpha - x| \le \delta ∣ α − ϕ ( x ) ∣ ≤ λ ∣ α − x ∣ < ∣ α − x ∣ ≤ δ Hence we have ϕ ( I ) ⊂ I \phi(I) \subset I ϕ ( I ) ⊂ I . Now apply previous theorem using [ a , b ] = [ α − δ , α + δ ] [a,b] = [\alpha- \delta,\alpha+\delta] [ a , b ] = [ α − δ , α + δ ] .
To find the root of f ( x ) = x 2 − a f(x)=x^2-a f ( x ) = x 2 − a , we have tried the following
x n + 1 = ϕ ( x n ) , ϕ ( x ) = x + c ( x 2 − a ) , c = 0.1 x_{n+1} = \phi(x_n), \qquad \phi(x) = x + c(x^2 -a), \qquad c = 0.1 x n + 1 = ϕ ( x n ) , ϕ ( x ) = x + c ( x 2 − a ) , c = 0.1 which did not converge. We need
∣ ϕ ′ ( a ) ∣ = ∣ 1 + 2 c a ∣ < 1 ⟹ − 1 a < c < 0 |\phi'(\sqrt{a})| = |1 + 2 c \sqrt{a}| < 1 \Longrightarrow - \frac{1}{\sqrt{a}} < c < 0 ∣ ϕ ′ ( a ) ∣ = ∣1 + 2 c a ∣ < 1 ⟹ − a 1 < c < 0 For a = 3 a=3 a = 3 , the good values are c ∈ ( − 0.578 , 0.0 ) c \in (-0.578, 0.0) c ∈ ( − 0.578 , 0.0 ) , so try with
c = − 0.1 c=-0.1 c = − 0.1 for example.
3.7.4 Order of convergence ¶ If ϕ ′ ( α ) ≠ 0 \phi'(\alpha) \ne 0 ϕ ′ ( α ) = 0 , then we only get linear convergence. For faster convergence, derivatives of ϕ need to vanish at the root.
Let α be a fixed point of ϕ and let ϕ be p p p times continuously differentiable around α, for some p ≥ 2 p \ge 2 p ≥ 2 . Further, assume
ϕ ′ ( α ) = ϕ ′ ′ ( α ) = … = ϕ ( p − 1 ) ( α ) = 0 ϕ ( p ) ( α ) ≠ 0 \begin{gather}
\phi'(\alpha) = \phi''(\alpha) = \ldots = \phi^{(p-1)}(\alpha) = 0 \\
\phi^{(p)}(\alpha) \ne 0
\end{gather} ϕ ′ ( α ) = ϕ ′′ ( α ) = … = ϕ ( p − 1 ) ( α ) = 0 ϕ ( p ) ( α ) = 0 Then if the initial guess x 0 x_0 x 0 is sufficiently close to α, the iteration x n + 1 = ϕ ( x n ) x_{n+1} = \phi(x_n) x n + 1 = ϕ ( x n ) will have order of convergence p p p and
lim n → ∞ α − x n + 1 ( α − x n ) p = ( − 1 ) p − 1 ϕ ( p ) ( α ) p ! \lim_{n \to \infty} \frac{\alpha - x_{n+1}}{(\alpha - x_n)^p} = (-1)^{p-1}
\frac{\phi^{(p)}(\alpha)}{p!} n → ∞ lim ( α − x n ) p α − x n + 1 = ( − 1 ) p − 1 p ! ϕ ( p ) ( α ) Since ϕ ′ ( α ) = 0 \phi'(\alpha) = 0 ϕ ′ ( α ) = 0 , we can choose an interval I = [ α − δ , α + δ ] I = [\alpha-\delta,\alpha+\delta] I = [ α − δ , α + δ ] in which ∣ ϕ ′ ( x ) ∣ < 1 |\phi'(x)| < 1 ∣ ϕ ′ ( x ) ∣ < 1 . Using previous theorem, we know the iterations converge for any initial guess x 0 ∈ I x_0 \in I x 0 ∈ I .
Now by Taylor expansion around α
x n + 1 = ϕ ( x n ) = ϕ ( α ) + ( x n − α ) ϕ ′ ( α ) + … + ( x n − α ) p − 1 ( p − 1 ) ! ϕ ( p − 1 ) ( α ) + ( x n − α ) p p ! ϕ ( p ) ( ξ n ) , ξ n between x n and α \begin{aligned}
x_{n+1}
&= \phi(x_n) \\
&= \phi(\alpha) + (x_n-\alpha) \phi'(\alpha) + \ldots + \frac{(x_n-\alpha)^{p-1}} {(p-1)!} \phi^{(p-1)}(\alpha) \\
& \quad + \frac{(x_n-\alpha)^p}{p!} \phi^{(p)}(\xi_n), \qquad \textrm{$\xi_n$ between $x_n$ and $\alpha$}
\end{aligned} x n + 1 = ϕ ( x n ) = ϕ ( α ) + ( x n − α ) ϕ ′ ( α ) + … + ( p − 1 )! ( x n − α ) p − 1 ϕ ( p − 1 ) ( α ) + p ! ( x n − α ) p ϕ ( p ) ( ξ n ) , ξ n between x n and α Using the given conditions on ϕ
x n + 1 = α + ( x n − α ) p p ! ϕ ( p ) ( ξ n ) x_{n+1} = \alpha + \frac{(x_n-\alpha)^p}{p!} \phi^{(p)}(\xi_n) x n + 1 = α + p ! ( x n − α ) p ϕ ( p ) ( ξ n ) and hence we have p p p ’th order convergence
∣ x n + 1 − α ∣ ≤ M ∣ x n − α ∣ p , M = 1 p ! max ∣ ϕ ( p ) ∣ |x_{n+1} - \alpha| \le M |x_n-\alpha|^p, \qquad M = \frac{1}{p!} \max |\phi^{(p)}| ∣ x n + 1 − α ∣ ≤ M ∣ x n − α ∣ p , M = p ! 1 max ∣ ϕ ( p ) ∣ The last result also follows easily since ξ n → α \xi_n \to \alpha ξ n → α .
Example 3 (Newton-Raphson method)
The iteration function for Newton-Raphson method is
ϕ ( x ) = x − f ( x ) f ′ ( x ) \phi(x) = x - \frac{f(x)}{f'(x)} ϕ ( x ) = x − f ′ ( x ) f ( x ) for which
ϕ ′ ( x ) = f ( x ) f ′ ′ ( x ) [ f ′ ( x ) ] 2 , ϕ ′ ′ ( x ) = f ′ ′ ( x ) f ′ ( x ) + f ( x ) f ′ ′ ′ ( x ) [ f ′ ( x ) ] 2 − 2 f ( x ) [ f ′ ′ ( x ) ] 2 [ f ′ ( x ) ] 3 \phi'(x) = \frac{f(x) f''(x)}{[f'(x)]^2}, \qquad
\phi''(x) = \frac{f''(x)}{f'(x)} + \frac{f(x) f'''(x)}{[f'(x)]^2} - \frac{2 f(x) [f''(x)]^2}{[f'(x)]^3} ϕ ′ ( x ) = [ f ′ ( x ) ] 2 f ( x ) f ′′ ( x ) , ϕ ′′ ( x ) = f ′ ( x ) f ′′ ( x ) + [ f ′ ( x ) ] 2 f ( x ) f ′′′ ( x ) − [ f ′ ( x ) ] 3 2 f ( x ) [ f ′′ ( x ) ] 2 If f ′ ( α ) ≠ 0 f'(\alpha) \ne 0 f ′ ( α ) = 0 , then
ϕ ′ ( α ) = 0 , ϕ ′ ′ ( α ) = f ′ ′ ( α ) f ′ ( α ) \phi'(\alpha) = 0, \qquad \phi''(\alpha) = \frac{f''(\alpha)}{f'(\alpha)} ϕ ′ ( α ) = 0 , ϕ ′′ ( α ) = f ′ ( α ) f ′′ ( α ) and hence Newton method converges with order atleast p = 2 p=2 p = 2 .
Consider the function
f ( x ) = ( x − 1 ) 2 sin ( x ) f(x) = (x-1)^2 \sin(x) f ( x ) = ( x − 1 ) 2 sin ( x ) for which x = 1 x=1 x = 1 is a double root.
def f(x):
return (x-1.0)**2 * sin(x)
def df(x):
return 2.0*(x-1.0)*sin(x) + (x-1.0)**2 * cos(x)
x = linspace(0.0,2.0,100)
plot(x,f(x)), xlabel('x'), ylabel('f(x)'), grid(True);
Here is the Newton method
def newton(x0,m=1.0):
n = 50
x = zeros(50)
x[0] = x0
print("%6d %24.14e" % (0,x[0]))
for i in range(1,50):
x[i] = x[i-1] - m*f(x[i-1])/df(x[i-1])
if i > 1:
r = (x[i] - x[i-1])/(x[i-1]-x[i-2])
else:
r = 0.0
print("%6d %24.14e %14.6e" % (i,x[i],r))
if abs(f(x[i])) < 1.0e-14:
break
The Newton method gives
0 2.00000000000000e+00
1 1.35163555744248e+00 0.000000e+00
2 1.18244356861394e+00 2.609520e-01
3 1.09450383817604e+00 5.197630e-01
4 1.04837639635805e+00 5.245347e-01
5 1.02452044172239e+00 5.171749e-01
6 1.01235093391487e+00 5.101245e-01
7 1.00619920246051e+00 5.055037e-01
8 1.00310567444820e+00 5.028711e-01
9 1.00155437342780e+00 5.014666e-01
10 1.00077757303341e+00 5.007412e-01
11 1.00038888338214e+00 5.003726e-01
12 1.00019446594325e+00 5.001868e-01
13 1.00009723903915e+00 5.000935e-01
14 1.00004862103702e+00 5.000468e-01
15 1.00002431089794e+00 5.000234e-01
16 1.00001215554384e+00 5.000117e-01
17 1.00000607779564e+00 5.000059e-01
18 1.00000303890375e+00 5.000029e-01
19 1.00000151945336e+00 5.000015e-01
20 1.00000075972705e+00 5.000007e-01
21 1.00000037986362e+00 5.000004e-01
22 1.00000018993183e+00 5.000002e-01
23 1.00000009496592e+00 5.000001e-01
which shows convergence towards x = 1 x=1 x = 1 . Note that
ϕ ′ ( 1 ) = f ( 1 ) f ′ ′ ( 1 ) [ f ′ ( 1 ) ] 2 = 1 2 < 1 \phi'(1) = \frac{f(1) f''(1)}{[f'(1)]^2} = \half < 1 ϕ ′ ( 1 ) = [ f ′ ( 1 ) ] 2 f ( 1 ) f ′′ ( 1 ) = 2 1 < 1 Consistent with this, we observe that Newton method is converging but only linearly; the last column above shows that
∣ x n + 1 − x n ∣ ∣ x n − x n − 1 ∣ ≈ 1 2 = ϕ ′ ( 1 ) , n → ∞ \frac{|x_{n+1} - x_n|}{|x_n - x_{n-1}|} \approx \half = \phi'(1), \qquad n \to \infty ∣ x n − x n − 1 ∣ ∣ x n + 1 − x n ∣ ≈ 2 1 = ϕ ′ ( 1 ) , n → ∞ 3.7.5 Multiple roots and Newton method ¶ Let α be a root of f ( x ) f(x) f ( x ) with multiplicity m m m , so that
f ( α ) = f ′ ( α ) = … = f ( m − 1 ) ( α ) = 0 , f ( m ) ( α ) ≠ 0 f(\alpha) = f'(\alpha) = \ldots = f^{(m-1)}(\alpha) = 0, \qquad f^{(m)}(\alpha) \ne 0 f ( α ) = f ′ ( α ) = … = f ( m − 1 ) ( α ) = 0 , f ( m ) ( α ) = 0 A Taylor expansion around α gives
f ( x ) = ( x − α ) m m ! f ( m ) ( ξ x ) , ξ x between α and x f(x) = \frac{(x-\alpha)^m}{m!} f^{(m)}(\xi_x), \qquad \textrm{$\xi_x$ between $
\alpha$ and $x$} f ( x ) = m ! ( x − α ) m f ( m ) ( ξ x ) , ξ x between α and x Near α, f ( x ) f(x) f ( x ) behaves like
f ( x ) = ( x − α ) m h ( x ) , h ( α ) ≠ 0 f(x) = (x-\alpha)^m h(x), \qquad h(\alpha) \ne 0 f ( x ) = ( x − α ) m h ( x ) , h ( α ) = 0 Let us apply Newton method to this function. Since
f ′ ( x ) = ( x − α ) m h ′ ( x ) + m ( x − α ) m − 1 h ( x ) f'(x) = (x-\alpha)^m h'(x) + m(x-\alpha)^{m-1} h(x) f ′ ( x ) = ( x − α ) m h ′ ( x ) + m ( x − α ) m − 1 h ( x ) so that the iteration function of Newton method is
ϕ ( x ) = x − f ( x ) f ′ ( x ) = x − ( x − α ) h ( x ) m h ( x ) + ( x − α ) h ′ ( x ) \phi(x) = x - \frac{f(x)}{f'(x)} = x - \frac{(x-\alpha) h(x)}{mh(x) + (x-\alpha)h'(x)} ϕ ( x ) = x − f ′ ( x ) f ( x ) = x − mh ( x ) + ( x − α ) h ′ ( x ) ( x − α ) h ( x ) This satisfies
ϕ ′ ( α ) = 1 − 1 m ≠ 0 if m ≥ 2 \phi'(\alpha) = 1 - \frac{1}{m} \ne 0 \qquad \textrm{if} \quad m \ge 2 ϕ ′ ( α ) = 1 − m 1 = 0 if m ≥ 2 Thus Newton method converges since ∣ ϕ ′ ( α ) ∣ < 1 |\phi'(\alpha)| < 1 ∣ ϕ ′ ( α ) ∣ < 1 , but only linearly, with rate of convergence c = m − 1 m c = \frac{m-1}{m} c = m m − 1 .
To improve Newton method for multiple roots, we need an iteration function with ϕ ′ ( α ) = 0 \phi'(\alpha) = 0 ϕ ′ ( α ) = 0 . This is satisfied for
ϕ ( x ) = x − m f ( x ) f ′ ( x ) \phi(x) = x - m \frac{f(x)}{f'(x)} ϕ ( x ) = x − m f ′ ( x ) f ( x ) and
lim n → ∞ α − x n + 1 ( α − x n ) 2 = − 1 2 ϕ ′ ′ ( α ) \lim_{n \to \infty} \frac{\alpha - x_{n+1}}{(\alpha - x_n)^2} = - \half \phi''(\alpha) n → ∞ lim ( α − x n ) 2 α − x n + 1 = − 2 1 ϕ ′′ ( α ) which recovers the quadratic convergence of Newton method.
We repeat previous example but with m = 2 m=2 m = 2 since we have a double root
0 2.00000000000000e+00
1 7.03271114884954e-01 0.000000e+00
2 1.06293359720705e+00 -2.773614e-01
3 1.00108318855754e+00 -1.719679e-01
4 1.00000037565568e+00 1.750696e-02
5 1.00000000000005e+00 3.469257e-04
Now we recover quadratic convergence which is clearly faster than linear convergence !!! Compare the number of correct decimal places in the root between the two methods.
The view point of fixed point iterations helped us to understand why Newton converges only linearly to multiple roots, and also helped to fix it so that quadratic convergence is recovered.
Note that f ′ ( x n ) → 0 f'(x_n) \to 0 f ′ ( x n ) → 0 as x n → α x_n \to \alpha x n → α , a multiple root. We have to be careful in applying Newton method since we divide by f ′ ( x n ) f'(x_n) f ′ ( x n ) . In the unlikely case that the denominator becomes exactly zero, it may generate inf or NaN.