#%config InlineBackend.figure_format = 'svg'
from pylab import *
import sympy
3.4.1 The method

The Newton-Raphson method is an iterative method to find a root of a function. It starts with an initial guess for the root and tries to improve it. Given an initial guess $x_0$ for the root, we have
$$f(x_0) \ne 0$$
Let us try to find a correction $\Delta x_0$ which brings it closer to the root:
$$f(x_0 + \Delta x_0) = 0$$
Using the Taylor formula and truncating it at the linear term,
$$f(x_0) + \Delta x_0 f'(x_0) \approx 0 \qquad \Longrightarrow \qquad \Delta x_0 \approx -\frac{f(x_0)}{f'(x_0)}$$
We obtain a new estimate for the root:
$$x_1 = x_0 + \Delta x_0 = x_0 - \frac{f(x_0)}{f'(x_0)}$$
Geometrically, we are approximating the function by the tangent line at $x_0$, which is given by
$$y = y(x) = f(x_0) + (x - x_0) f'(x_0)$$
and finding the zero of the tangent line,
$$y(x_1) = 0 \qquad \Longrightarrow \qquad x_1 = x_0 - \frac{f(x_0)}{f'(x_0)}$$
see figure. We now repeat the same process for $x_1, x_2, \ldots$ until convergence is achieved.
This algorithm requires that $f$ is differentiable and that $f'$ is nonzero in a neighbourhood of the root. The iterations may not converge in a finite number of steps, so we need a stopping criterion. We can use a check on the function value and another on the root; we stop when
$$|f(x_{n+1})| < \delta \qquad \textrm{and/or} \qquad |x_{n+1} - x_n| < |x_{n+1}|\,\epsilon$$
holds. The second criterion is a test for the relative error in the location of the root; its motivation is explained below.
The algorithm for the Newton-Raphson method is given below.
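Here is a minimal Python sketch of one such implementation; the routines F and DF evaluate $f$ and $f'$, and the stopping tests are the ones described above.

def newton(F, DF, x0, M=100, eps=1.0e-14, delta=1.0e-16):
    x = x0
    for i in range(M):
        dx = -F(x)/DF(x)              # Newton correction
        x = x + dx
        if abs(dx) < eps*abs(x) or abs(F(x)) < delta:
            return x                  # converged
    print('No convergence in %d iterations' % M)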
Let us apply this method to some functions.
Example 1 (Affine function)
Consider an affine function $f(x) = ax + b$. Let $x_0$ be any initial guess. The Newton iteration yields
$$x_1 = x_0 - \frac{a x_0 + b}{a} = -\frac{b}{a}$$
so we converge to the root in one iteration.
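As a quick numerical check (the values $a = 2$, $b = -3$ and $x_0 = 10$ below are arbitrary illustrative choices):

a, b = 2.0, -3.0
x0 = 10.0                     # any starting point works
x1 = x0 - (a*x0 + b)/a        # one Newton step
print(x1, -b/a)               # both equal 1.5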
Example 2

Find the root of
$$f(x) = \exp(x) - \frac{3}{2} - \arctan x$$
which lies in the interval $[\tfrac{1}{2}, 1]$.
We define the function using sympy and automatically compute its derivative. Then we make functions out of these and plot them.
x = sympy.Symbol('x')
# Define the function
f = sympy.exp(x) - 3.0/2.0 - sympy.atan(x)
# Compute its derivative
df = sympy.diff(f,x)
print("f = ", f)
print("df= ", df)
# Make functions
F = sympy.lambdify(x, f, modules=['numpy'])
DF = sympy.lambdify(x, df, modules=['numpy'])
# Plot the functions
X = linspace(-1,1,100)
figure(figsize=(9,4))
subplot(1,2,1), plot(X,F(X)), grid(True), title('Function')
subplot(1,2,2), plot(X,DF(X)), grid(True), title('Derivative');
f = exp(x) - atan(x) - 1.5
df= exp(x) - 1/(x**2 + 1)
Now we implement the Newton method. We have to specify the maximum number of iterations, an initial guess and a tolerance to decide when to stop.
M = 20 # maximum iterations
x = 0.5 # initial guess
eps = 1e-14 # relative tolerance on root
delta = 1e-16 # tolerance on f
f = F(x)
for i in range(M):
    df = DF(x)
    dx = -f/df
    x = x + dx
    e = abs(dx)
    f = F(x)
    print("%6d %22.14e %22.14e %22.14e" % (i+1,x,e,abs(f)))
    if e < eps * abs(x) or abs(f) < delta:
        break
1 8.71059792151655e-01 3.71059792151655e-01 1.72847808769943e-01
2 7.76133043351671e-01 9.49267487999836e-02 1.30353403123729e-02
3 7.67717620302437e-01 8.41542304923425e-03 9.81774270505387e-05
4 7.67653269950858e-01 6.43503515784806e-05 5.71995606435394e-09
5 7.67653266201279e-01 3.74957960683442e-09 2.22044604925031e-16
6 7.67653266201279e-01 1.45556000556775e-16 0.00000000000000e+00
Example 3 (Absolute and relative tolerance)
We consider the same function as in a previous example, but we have shifted it so that the root lies at a large distance from the origin,
$$f(x) = \exp(x - x_0) - \frac{3}{2} - \arctan(x - x_0)$$
where $x_0$ is some large number, e.g., $x_0 = 10^{10}$. The root lies near $x_0$ and, due to the finite precision of floating point numbers, we cannot expect to compute it to a good absolute tolerance, but only to a relative tolerance.
Define the function and its derivative.
x0 = 1e10
# Define the function
F = lambda x: exp(x-x0) - 3.0/2.0 - arctan(x-x0)
# Compute its derivative
DF = lambda x: exp(x-x0) - 1.0/(1 + (x-x0)**2)
Now we implement the Newton method. We have to specify the maximum number of iterations, an initial guess and a tolerance to decide when to stop.
M = 50 # maximum iterations
eps = 1e-14 # relative tolerance on root
Using relative tolerance on the root
x = x0 + 0.5 # initial guess
f = F(x)
for i in range(M):
    df = DF(x)
    dx = -f/df
    x = x + dx
    e = abs(dx)
    f = F(x)
    print("%6d %22.14e %22.14e %22.14e" % (i,x,e,abs(f)))
    if e < eps * abs(x):
        break
0 1.00000000008711e+10 3.71059792151655e-01 1.72847126992936e-01
1 1.00000000007761e+10 9.49264320087584e-02 1.30346281936581e-02
2 1.00000000007677e+10 8.41497025309343e-03 9.77825039385483e-05
3 1.00000000007677e+10 6.40915294389154e-05 1.15114296217467e-06
The relative change in the root reaches machine precision within a few iterations, and the function value is $O(10^{-6})$.
Using absolute tolerance on the root
x = x0 + 0.5 # initial guess
f = F(x)
for i in range(M):
    df = DF(x)
    dx = -f/df
    x = x + dx
    e = abs(dx)
    f = F(x)
    print("%6d %22.14e %22.14e %22.14e" % (i,x,e,abs(f)))
    if e < eps:
        break
0 1.00000000008711e+10 3.71059792151655e-01 1.72847126992936e-01
1 1.00000000007761e+10 9.49264320087584e-02 1.30346281936581e-02
2 1.00000000007677e+10 8.41497025309343e-03 9.77825039385483e-05
3 1.00000000007677e+10 6.40915294389154e-05 1.15114296217467e-06
4 1.00000000007677e+10 7.54605114902618e-07 1.15114296217467e-06
     ...    (iterations 5 through 48 print the same values)    ...
49 1.00000000007677e+10 7.54605114902618e-07 1.15114296217467e-06
There is no convergence. The root cannot be computed to good absolute accuracy, and consequently we cannot make $f(x)$ sufficiently small. We can reduce it only to about $10^{-6}$, which is also achieved when the relative tolerance is used as the stopping criterion.
3.4.2 Convergence analysis

Let $r$ be a root of $f$. From the Taylor formula with remainder term,
$$0 = f(r) = f(x_n) + (r - x_n) f'(x_n) + \frac{1}{2} (r - x_n)^2 f''(\xi_n), \qquad \xi_n \textrm{ between } r \textrm{ and } x_n$$
Solving for $r$ from the linear term,
$$r = \underbrace{x_n - \frac{f(x_n)}{f'(x_n)}}_{x_{n+1}} - \frac{1}{2} (r - x_n)^2 \frac{f''(\xi_n)}{f'(x_n)}$$
The error in the root is
$$r - x_{n+1} = -(r - x_n)^2 \frac{f''(\xi_n)}{2 f'(x_n)}$$
At least in a neighbourhood of the root, we will assume that
$$\left| \frac{f''(\xi_n)}{2 f'(x_n)} \right| \le M$$
Then the error $e_n = |r - x_n|$ satisfies
$$e_{n+1} \le M e_n^2$$
To show convergence $x_n \to r$, we have to show that $e_n \to 0$.
Theorem 1 (Convergence of Newton method)
Assume that $f, f', f''$ are continuous in some neighbourhood of the root $r$, and $f'(r) \ne 0$. If $x_0$ is chosen sufficiently close to $r$, then
$$\lim_{n \to \infty} x_n = r, \qquad \lim_{n \to \infty} \frac{r - x_{n+1}}{(r - x_n)^2} = -\frac{f''(r)}{2 f'(r)}$$

Proof. (1) Pick a sufficiently small interval $I = [r - \delta, r + \delta]$ on which $f' \ne 0$. This is possible since $f'$ is continuous and $f'(r) \ne 0$. Define
$$M = \frac{\max_{x \in I} |f''(x)|}{\min_{x \in I} |f'(x)|}$$
We have
$$|r - x_1| \le M |r - x_0|^2 \qquad \Longrightarrow \qquad M |r - x_1| \le (M |r - x_0|)^2$$
Choose $x_0$ such that
$$|r - x_0| \le \delta \quad\textrm{and}\quad M |r - x_0| < 1 \qquad \Longrightarrow \qquad M |r - x_1| < 1$$
Moreover,
$$|r - x_1| \le \underbrace{M |r - x_0|}_{< 1} \cdot |r - x_0| < |r - x_0| \le \delta$$
Hence $x_1 \in I$. Applying the same argument inductively to $x_2, x_3, \ldots$, we obtain
$$\forall n \ge 0 \qquad \begin{cases}
|r - x_n| \le \delta \quad \Longrightarrow \quad x_n \in I \\
M |r - x_n| < 1
\end{cases}$$
To show convergence, we write
$$|r - x_{n+1}| \le M |r - x_n|^2 \qquad \Longrightarrow \qquad M |r - x_{n+1}| \le (M |r - x_n|)^2$$
and inductively,
$$M |r - x_n| \le (M |r - x_0|)^{2^n}$$
Now taking the limit,
$$\lim_{n \to \infty} |r - x_n| \le \frac{1}{M} \lim_{n \to \infty} (M |r - x_0|)^{2^n} = 0$$
since $M |r - x_0| < 1$, which proves the convergence.
(2) In the error formula
$$r - x_{n+1} = -(r - x_n)^2 \frac{f''(\xi_n)}{2 f'(x_n)}$$
$\xi_n$ lies between $r$ and $x_n$, and hence $\xi_n \to r$. Thus
$$\lim_{n \to \infty} \frac{r - x_{n+1}}{(r - x_n)^2} = -\lim_{n \to \infty} \frac{f''(\xi_n)}{2 f'(x_n)} = -\frac{f''(r)}{2 f'(r)}$$

3.4.3 Error estimate

From the mean value theorem,
$$f(x_n) = f(x_n) - f(r) = f'(\xi_n)(x_n - r)$$
where $\xi_n$ is between $x_n$ and $r$, so that
$$r - x_n = -\frac{f(x_n)}{f'(\xi_n)}$$
If $f'$ is not changing rapidly between $x_n$ and $r$, then
$$f'(\xi_n) \approx f'(x_n)$$
We get the error estimate
$$r - x_n \approx -\frac{f(x_n)}{f'(x_n)} = x_{n+1} - x_n$$
and the relative error estimate
$$\frac{r - x_n}{r} \approx \frac{x_{n+1} - x_n}{x_{n+1}}$$
Hence we can use the test
$$|x_{n+1} - x_n| < \epsilon |x_{n+1}|, \qquad 0 < \epsilon \ll 1$$
to decide on convergence. This is the test used in the code examples to stop the iterations.
Since
$$f(x_n) = (x_n - r) f'(\xi_n), \qquad \xi_n \textrm{ between } r \textrm{ and } x_n$$
and if we stop when $|x_n - r| < \epsilon$, then
$$|f(x_n)| \lesssim \epsilon |f'(r)|$$
Thus, even if the root is sufficiently accurate, the function value may not be small, depending on the size of $f'(r)$.
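To illustrate (a small experiment, not part of the original notes), scale a function by a large constant $K$: the Newton iterates are unchanged, but the residual $|f(x_n)|$ is multiplied by $K$. Here $K = 10^8$ is an arbitrary choice.

K = 1.0e8                       # large scale factor => large f'(r)
F = lambda x: K*(x**2 - 2.0)    # same root sqrt(2) as x**2 - 2
DF = lambda x: 2.0*K*x
x = 2.0
for i in range(6):
    x = x - F(x)/DF(x)
print('error in root =', abs(x - sqrt(2.0)))
print('|f(x)|        =', abs(F(x)))   # about K times machine epsilon

The root is accurate to machine precision, yet $|f(x)|$ is only of size $10^{-8}$.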
Example 4 (Square root)

For $a > 0$, the function
$$f(x) = x^2 - a$$
has roots $\pm\sqrt{a}$. Newton's method is
$$x_{k+1} = \frac{1}{2} \left( x_k + \frac{a}{x_k} \right)$$
which uses only basic arithmetic operations (though division is not very basic). Geometrically, it is easy to see that if $x_0 > 0$ we converge to the positive root, and if $x_0 < 0$ we converge to the negative root.
Define the function and its derivative.
def f(x):
    return x**2 - a

def df(x):
    return 2.0*x
Now we implement the Newton method as a function.
def newton1(f,df,x0,M=100,eps=1.0e-15):
    x = x0
    for i in range(M):
        dx = - f(x)/df(x)
        x = x + dx
        print('%3d %22.14e %22.14e %22.14e'%(i+1,x,x-r,abs(f(x))))
        if abs(dx) < eps * abs(x):
            return x
    print('No convergence, current root = %e' % x)
For illustration purposes only, we use the exact root r inside the above function to show the convergence of the error. In practice, we do not know the exact root.
We call the Newton method with an initial guess for the root. Since f uses the global a, and newton1 prints the error x - r, we define both first.
a = 2.0
r = sqrt(a)  # exact root, used only for the error column
x0 = 2.0
x = newton1(f,df,x0)
1 1.50000000000000e+00 8.57864376269049e-02 2.50000000000000e-01
2 1.41666666666667e+00 2.45310429357160e-03 6.94444444444464e-03
3 1.41421568627451e+00 2.12390141474117e-06 6.00730488287127e-06
4 1.41421356237469e+00 1.59472435257157e-12 4.51061410444709e-12
5 1.41421356237310e+00 0.00000000000000e+00 4.44089209850063e-16
6 1.41421356237309e+00 -2.22044604925031e-16 4.44089209850063e-16
From the last two columns, we observe that the number of correct decimal places roughly doubles in each iteration.
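We can also check the limit in Theorem 1 numerically: for $f(x) = x^2 - a$ the ratio $e_{n+1}/e_n^2$ should approach $|f''(r)/(2 f'(r))| = 1/(2\sqrt{a})$. A short check, not part of the original notes:

a = 2.0
r = sqrt(a)
x, err = 2.0, []
for i in range(4):
    x = 0.5*(x + a/x)          # Newton iteration for sqrt(a)
    err.append(abs(x - r))
for e0, e1 in zip(err, err[1:]):
    print(e1/e0**2)            # tends to 1/(2*sqrt(2)) ~ 0.3536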
3.4.4 Implication of quadratic convergence

What does quadratic convergence imply in practical terms? For the Newton method,
$$e_{k+1} \le M e_k^2$$
Suppose $M \approx 1.0$ and $e_0 = 10^{-1}$. Then
$$\begin{aligned}
& e_1 \approx 10^{-2} \\
& e_2 \approx 10^{-4} \\
& e_3 \approx 10^{-8} \\
& e_4 \approx 10^{-16} \qquad \textrm{(limit of double precision)}
\end{aligned}$$
If the root $r = O(1)$, then
$$\begin{aligned}
& x_1 \textrm{ accurate to 2 digits} \\
& x_2 \textrm{ accurate to 4 digits} \\
& x_3 \textrm{ accurate to 8 digits} \\
& x_4 \textrm{ accurate to 16 digits}
\end{aligned}$$
Newton's method doubles the number of accurate figures in each iteration. Of course, this happens only after the method enters the quadratic convergence regime; in the first few iterations there may be no reduction of the error, or a less than quadratic reduction.
Observe the number of correct decimal places in the square root example, which roughly doubles in each iteration.
Example 5 (Reciprocal)

Given a real number $a > 0$, we want to find its reciprocal. Since division cannot be performed exactly on a computer, we would like a method which uses only addition and multiplication. We can cook up a function whose root is the reciprocal, e.g.,
$$f(x) = \frac{1}{x} - a$$
Since $f'(x) = -1/x^2$, applying the Newton method to this function yields the iteration (see figure)
$$x_{k+1} = x_k - \frac{1/x_k - a}{-1/x_k^2} = 2 x_k - a x_k^2$$
which involves only subtraction and multiplication.
If we start with any $0 < x_0 \le \frac{1}{a}$, then we converge to the root, as we can verify geometrically. If we start with $x_0 > \frac{1}{a}$ but sufficiently close to it, then we again converge to the root. Indeed, since $x_1 = x_0(2 - a x_0)$, we have $x_1 \in (0, \frac{1}{a}]$ iff $x_0 \in (0, \frac{2}{a})$, and the subsequent iterates converge. If $x_0 \ge \frac{2}{a}$, then $x_1 \le 0$, after which the iterates diverge to $-\infty$. Hence any initial guess in the interval $(0, \frac{2}{a})$ converges to the desired root. Note that if $a < 1$, then $x_0 = a$ is a valid initial guess.
Let us try for $a = 10^{-10}$ with the initial guess $x_0 = a$.
a = 1.0e-10
r = 1.0/a # exact root
def F(x):
    return 1/x - a

def DF(x):
    return -1/x**2
Now run the Newton method.
M = 100 # maximum iterations
x0 = a # initial guess
eps = 1e-15 # relative tolerance on root
x = newton1(F,DF,x0)
1 2.00000000000000e-10 -1.00000000000000e+10 5.00000000000000e+09
2 4.00000000000000e-10 -1.00000000000000e+10 2.50000000000000e+09
3 8.00000000000000e-10 -1.00000000000000e+10 1.25000000000000e+09
4 1.60000000000000e-09 -1.00000000000000e+10 6.25000000000000e+08
5 3.20000000000000e-09 -1.00000000000000e+10 3.12500000000000e+08
6 6.40000000000000e-09 -1.00000000000000e+10 1.56250000000000e+08
7 1.28000000000000e-08 -1.00000000000000e+10 7.81250000000000e+07
8 2.56000000000000e-08 -1.00000000000000e+10 3.90625000000000e+07
9 5.12000000000000e-08 -1.00000000000000e+10 1.95312500000000e+07
10 1.02400000000000e-07 -1.00000000000000e+10 9.76562500000000e+06
11 2.04800000000000e-07 -1.00000000000000e+10 4.88281250000000e+06
12 4.09600000000000e-07 -1.00000000000000e+10 2.44140625000000e+06
13 8.19200000000000e-07 -1.00000000000000e+10 1.22070312500000e+06
14 1.63840000000000e-06 -1.00000000000000e+10 6.10351562500000e+05
15 3.27680000000000e-06 -1.00000000000000e+10 3.05175781250000e+05
16 6.55360000000000e-06 -9.99999999999999e+09 1.52587890625000e+05
17 1.31072000000000e-05 -9.99999999999999e+09 7.62939453124999e+04
18 2.62144000000000e-05 -9.99999999999997e+09 3.81469726562499e+04
19 5.24287999999999e-05 -9.99999999999995e+09 1.90734863281249e+04
20 1.04857599999999e-04 -9.99999999999990e+09 9.53674316406245e+03
21 2.09715199999998e-04 -9.99999999999979e+09 4.76837158203120e+03
22 4.19430399999991e-04 -9.99999999999958e+09 2.38418579101557e+03
23 8.38860799999965e-04 -9.99999999999916e+09 1.19209289550776e+03
24 1.67772159999986e-03 -9.99999999999832e+09 5.96046447753856e+02
25 3.35544319999944e-03 -9.99999999999664e+09 2.98023223876903e+02
26 6.71088639999775e-03 -9.99999999999329e+09 1.49011611938427e+02
27 1.34217727999910e-02 -9.99999999998658e+09 7.45058059691883e+01
28 2.68435455999640e-02 -9.99999999997316e+09 3.72529029845691e+01
29 5.36870911998559e-02 -9.99999999994631e+09 1.86264514922596e+01
30 1.07374182399424e-01 -9.99999999989263e+09 9.31322574610478e+00
31 2.14748364797694e-01 -9.99999999978525e+09 4.65661287302739e+00
32 4.29496729590777e-01 -9.99999999957050e+09 2.32830643648869e+00
33 8.58993459163107e-01 -9.99999999914101e+09 1.16415321821935e+00
34 1.71798691825243e+00 -9.99999999828201e+09 5.82076609084674e-01
35 3.43597383620971e+00 -9.99999999656403e+09 2.91038304517337e-01
36 6.87194767123882e+00 -9.99999999312805e+09 1.45519152233668e-01
37 1.37438953377553e+01 -9.99999998625611e+09 7.27595760918342e-02
38 2.74877906566211e+01 -9.99999997251221e+09 3.63797880209171e-02
39 5.49755812376843e+01 -9.99999994502442e+09 1.81898939854586e-02
40 1.09951162173137e+02 -9.99999989004884e+09 9.09494696772928e-03
41 2.19902323137349e+02 -9.99999978009768e+09 4.54747345886464e-03
42 4.39804641438994e+02 -9.99999956019536e+09 2.27373670443232e-03
43 8.79609263535176e+02 -9.99999912039074e+09 1.13686832721616e-03
44 1.75921844969911e+03 -9.99999824078155e+09 5.68434138608081e-04
45 3.51843658991326e+03 -9.99999648156341e+09 2.84217044304043e-04
46 7.03687194188691e+03 -9.99999296312806e+09 1.42108497152026e-04
47 1.40737389320171e+04 -9.99998592626107e+09 7.10542235760217e-05
48 2.81474580570215e+04 -9.99997185254194e+09 3.55270867880284e-05
49 5.62948368861036e+04 -9.99994370516311e+09 1.77635183940494e-05
50 1.12589356861341e+05 -9.99988741064314e+09 8.88173419709507e-06
51 2.25177446086354e+05 -9.99977482255391e+09 4.44084209868827e-06
52 4.50349821684486e+05 -9.99954965017832e+09 2.22039604962561e-06
53 9.00679361872783e+05 -9.99909932063813e+09 1.11017302537576e-06
54 1.80127760141428e+06 -9.99819872239859e+09 5.55061513813778e-07
55 3.60223074272882e+06 -9.99639776925727e+09 2.77505759158689e-07
56 7.20316387882525e+06 -9.99279683612117e+09 1.38727884082944e-07
57 1.44011392006640e+07 -9.98559886079934e+09 6.93389510486708e-08
58 2.87815391203003e+07 -9.97121846087970e+09 3.46444935387308e-08
59 5.74802405411872e+07 -9.94251975945881e+09 1.72972827981375e-08
60 1.14630083277107e+08 -9.88536991672289e+09 8.62371345646323e-09
61 2.27946160955002e+08 -9.77205383904500e+09 4.28700084182336e-09
62 4.50696376680593e+08 -9.54930362331941e+09 2.11878863851772e-09
63 8.81080030965884e+08 -9.11891996903412e+09 1.03497067786652e-09
64 1.68452985983508e+09 -8.31547014016492e+09 4.93637443801620e-10
65 3.08529563480257e+09 -6.91470436519743e+09 2.24118048435897e-10
66 5.21868635419195e+09 -4.78131364580805e+09 9.16191033777574e-11
67 7.71390398204098e+09 -2.28609601795902e+09 2.96360445149611e-11
68 9.47737649966719e+09 -5.22623500332809e+08 5.51443218860368e-12
69 9.97268646768999e+09 -2.73135323100128e+07 2.73883395397065e-13
70 9.99992539709528e+09 -7.46029047241211e+04 7.46034612872794e-16
71 9.99999999944344e+09 -5.56560516357422e-01 5.56560720338062e-21
72 1.00000000000000e+10 0.00000000000000e+00 0.00000000000000e+00
73 1.00000000000000e+10 0.00000000000000e+00 0.00000000000000e+00
The third column shows the error in the root, which hardly decreases for about 60 iterations, after which there is a rapid reduction.
Example 6 (Slow convergence)
The quadratic convergence of the Newton method is obtained only when we are sufficiently close to the root; initially, the convergence may be very slow. Consider the reciprocal example with $a = 10^{-10}$, starting with $x_0 = a$: the method converges, but this initial guess is very far from the root. The first few iterates are
$$\begin{aligned}
x_0 &= 10^{-10} \\
x_1 &= 2 \cdot 10^{-10} - 10^{-30} = 2 \cdot 10^{-10} = 2 x_0 \\
x_2 &= 2 \cdot 2 \cdot 10^{-10} - 4 \cdot 10^{-30} = 2^2 \cdot 10^{-10} = 2 x_1
\end{aligned}$$
Thus, initially the iterates double in each step and the change per step is very small. This slow growth continues until we get close to the root, i.e., until
$$2^k \cdot 10^{-10} \approx 10^{10}$$
which needs $k \approx 20/\log_{10} 2 \approx 67$ iterations, after which we have rapid convergence.
3.4.5 Inflection point

If we have a function with an inflection point (see figure), then the Newton iterates may cycle indefinitely without converging. The Newton method is
$$\Delta x_k = -\frac{f(x_k)}{f'(x_k)}, \qquad x_{k+1} = x_k + \Delta x_k$$
We can use $|f(x)|$ as a metric to measure convergence towards the root: $|f(x_k)|$ must decrease monotonically towards zero. We accept the step $\Delta x_k$ if $|f(x_{k+1})| < |f(x_k)|$; otherwise we halve the step size, $\Delta x_k \leftarrow \frac{1}{2} \Delta x_k$. We can halve the step size repeatedly until a reduction in $|f(x)|$ is obtained; such a step size always exists. This is summarised in the following algorithm.
$$\Delta x = -f(x)/f'(x)$$
$$\textrm{while } |f(x + \Delta x)| > |f(x)|: \quad \Delta x = \Delta x / 2$$
Moreover, to prevent wild oscillations of the iterates, the step size can also be restricted, e.g., don't allow the step size to more than double in successive steps:
$$|\Delta x_k| \le 2 |\Delta x_{k-1}|$$
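A minimal sketch of such a safeguarded iteration is given below, assuming the scalar setting used so far; the cap of 50 halvings and the $\arctan$ test problem are illustrative choices, not from the original notes.

def newton_damped(F, DF, x0, M=100, eps=1.0e-14, delta=1.0e-15):
    x = x0
    for i in range(M):
        dx = -F(x)/DF(x)
        # halve the step until |f| decreases (capped for safety)
        for k in range(50):
            if abs(F(x + dx)) < abs(F(x)):
                break
            dx = 0.5*dx
        x = x + dx
        if abs(dx) < eps*abs(x) or abs(F(x)) < delta:
            return x
    print('No convergence, current root = %e' % x)

# arctan has an inflection point at x = 0: plain Newton diverges
# from x0 = 2, but the damped iteration converges to the root 0.
x = newton_damped(lambda x: arctan(x), lambda x: 1.0/(1.0 + x**2), 2.0)
print(x)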
Example 7 (Complex roots)

The Newton method can converge to a complex root if we start with a complex initial guess. The roots of
$$f(x) = x^5 + 1$$
lie on the unit circle in the complex plane. We can write
$$-1 = \exp(\mathrm{i}\,(2n+1)\pi), \qquad n = 0, 1, 2, \ldots$$
The fifth roots are
$$(-1)^{1/5} = \exp(\mathrm{i}\,(2n+1)\pi/5), \qquad n = 0, 1, 2, 3, 4$$
Let us compute the exact roots and plot them on a graph.
n = array([0,1,2,3,4])
r = exp(1j*(2*n+1)*pi/5) # 5'th roots of -1
plot(real(r),imag(r),'o')
title('Fifth roots of -1')
grid(True), xlabel('real'), ylabel('imag'), axis('equal')
for x in r:
    print(x,x**5)
(0.8090169943749475+0.5877852522924731j) (-1.0000000000000002+0j)
(-0.30901699437494734+0.9510565162951536j) (-1+4.440892098500626e-16j)
(-1+1.2246467991473532e-16j) (-1+6.123233995736766e-16j)
(-0.30901699437494756-0.9510565162951535j) (-1+7.216449660063518e-16j)
(0.8090169943749473-0.5877852522924734j) (-1.0000000000000004+1.1102230246251565e-15j)
Let us find a root using the Newton method with starting guess $1 + \mathrm{i}$.
f = lambda x: x**5 + 1.0
df = lambda x: 5.0 * x**4
def newton2(f,df,x0,M=100,eps=1.0e-15):
    x = x0
    for i in range(M):
        dx = - f(x)/df(x)
        x = x + dx
        print(i,x,abs(f(x)))
        if abs(dx) < eps * abs(x):
            return x
x0 = 1.0 + 1.0j
x = newton2(f,df,x0)
print('Root = ',x)
print('x**5 = ',x**5)
0 (0.85+0.8j) 1.4844926068747277
1 (0.7869450486819533+0.6530228729743794j) 0.3587868496977192
2 (0.8000098256925355+0.5887168663596131j) 0.04466991909204323
3 (0.8091271135998733+0.587660739881959j) 0.0008311338875735857
4 (0.8090169566882729+0.5877852118808937j) 2.762870121656608e-07
5 (0.8090169943749507+0.5877852522924782j) 3.014107274014632e-14
6 (0.8090169943749475+0.5877852522924731j) 2.220446049250313e-16
7 (0.8090169943749475+0.5877852522924731j) 2.220446049250313e-16
Root = (0.8090169943749475+0.5877852522924731j)
x**5 = (-1.0000000000000002+0j)
We do converge to a root, but which one depends on the initial guess. Moreover, a single run can find only one root.
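For instance, looping over a few starting guesses (an illustrative experiment; the particular guesses are arbitrary) picks out different roots:

# f, df as defined above
for z0 in [1+1j, -1+1j, -1-1j, 1-1j, 2+0j]:
    z = z0
    for i in range(100):
        dz = -f(z)/df(z)
        z = z + dz
        if abs(dz) < 1.0e-15*abs(z):
            break
    print(z0, '->', z)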
3.4.6 Implicit functions

Suppose we have a function $y = g(x) : \mathbb{R} \to \mathbb{R}$ which is defined implicitly by
$$G : \mathbb{R} \times \mathbb{R} \to \mathbb{R}, \qquad G(x, y) = 0$$
We may not be able to find an explicit expression for $g(x)$; instead, we find the value of $y$ for some discrete set of $x$ values. This leads to a root finding problem for each $x$, to which we can apply the Newton method.
In this context, recall the implicit function theorem.
Let $G : \mathbb{R}^m \times \mathbb{R}^n \to \mathbb{R}^n$ with $G(x_0, y_0) = 0$ and $G_y(x_0, y_0) \in \mathbb{R}^{n \times n}$ invertible. Then there exists a neighbourhood $U \subset \mathbb{R}^m$ of $x_0$ and a unique, continuously differentiable function $g : U \to \mathbb{R}^n$ such that $g(x_0) = y_0$ and $G(x, g(x)) = 0$ for $x \in U$.
Fix a value of $x$ and make an initial guess $y_0$. This may not satisfy $G(x, y_0) = 0$, so we seek a correction $\Delta y_0$ such that $G(x, y_0 + \Delta y_0) = 0$. By Taylor expansion with respect to the second argument of $G$,
$$G(x, y_0) + G_y(x, y_0) \Delta y_0 = 0 \quad \Longrightarrow \quad \Delta y_0 = -\frac{G(x, y_0)}{G_y(x, y_0)}$$
and hence we get an improved estimate
$$y_1 = y_0 + \Delta y_0 = y_0 - \frac{G(x, y_0)}{G_y(x, y_0)}$$
This can be used to generate the iterative scheme
$$y_{k+1} = y_k - \frac{G(x, y_k)}{G_y(x, y_k)}, \qquad k = 0, 1, 2, \ldots$$
If we have to solve this problem for several values of $x$, we can use the previous solution as the initial guess for the root, which can speed up the convergence of the Newton method, as in the sketch below.
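As an illustrative sketch (the circle $G(x,y) = x^2 + y^2 - 1$ and the grid of $x$ values are assumptions, not from the original notes), we trace the upper branch $y = g(x) = \sqrt{1 - x^2}$, carrying each computed $y$ forward as the initial guess at the next $x$:

G  = lambda x, y: x**2 + y**2 - 1.0   # defines y = g(x) implicitly
Gy = lambda x, y: 2.0*y               # derivative wrt second argument
y = 1.0                               # initial guess at the first x
for xv in linspace(0.0, 0.9, 10):
    for k in range(50):               # Newton iteration in y, x fixed
        dy = -G(xv, y)/Gy(xv, y)
        y = y + dy
        if abs(dy) < 1.0e-14*abs(y):
            break
    print('x = %4.2f  y = %18.14f  exact = %18.14f' % (xv, y, sqrt(1.0 - xv**2)))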
3.4.7 System of equations

Consider a function $f : \mathbb{R}^N \to \mathbb{R}^N$ such that for $x \in \mathbb{R}^N$,
$$f(x) = [f_1(x), \ldots, f_N(x)]^\top \in \mathbb{R}^N$$
We want to find an $r \in \mathbb{R}^N$ such that $f(r) = 0$. This is a system of $N$ non-linear equations for the $N$ unknowns. We start with an initial guess $x^0 \in \mathbb{R}^N$; in general $f(x^0) \ne 0$. We try to find a correction $\Delta x^0$ to $x^0$ such that
$$f(x^0 + \Delta x^0) = 0$$
Approximating by the first two terms of the Taylor expansion,
$$f(x^0) + f'(x^0) \Delta x^0 = 0$$
we can find the correction by solving the matrix problem
$$f'(x^0) \Delta x^0 = -f(x^0)$$
where the Jacobian matrix $f' \in \mathbb{R}^{N \times N}$ is given by
$$[f']_{ij} = \frac{\partial f_i}{\partial x_j}$$
Once we solve for $\Delta x^0 = -[f'(x^0)]^{-1} f(x^0)$, we get a new estimate of the root:
$$x^1 = x^0 + \Delta x^0 = x^0 - [f'(x^0)]^{-1} f(x^0)$$
The above steps are applied iteratively. Choose an initial guess $x^0$ and set $k = 0$.

1. Solve the matrix problem
$$f'(x^k) \Delta x^k = -f(x^k)$$
2. Update the root and the counter:
$$x^{k+1} = x^k + \Delta x^k, \qquad k = k + 1$$
3. If not converged, go to Step 1.
Example 8 (Newton method for system)
For $x = (x_0, x_1, x_2) \in \mathbb{R}^3$, define $f : \mathbb{R}^3 \to \mathbb{R}^3$ by
$$f(x) = \begin{bmatrix}
x_0 x_1 - x_2^2 - 1 \\
x_0 x_1 x_2 - x_0^2 + x_1^2 - 2 \\
\exp(x_0) - \exp(x_1) + x_2 - 3
\end{bmatrix}$$
The Jacobian matrix of $f$ is given by
$$f'(x) = \begin{bmatrix}
x_1 & x_0 & -2 x_2 \\
x_1 x_2 - 2 x_0 & x_0 x_2 + 2 x_1 & x_0 x_1 \\
\exp(x_0) & -\exp(x_1) & 1
\end{bmatrix}$$
We implement functions for these.
def f(x):
    y = zeros(3)
    y[0] = x[0]*x[1] - x[2]**2 - 1.0
    y[1] = x[0]*x[1]*x[2] - x[0]**2 + x[1]**2 - 2.0
    y[2] = exp(x[0]) - exp(x[1]) + x[2] - 3.0
    return y

def df(x):
    y = array([[x[1],               x[0],               -2.0*x[2]],
               [x[1]*x[2]-2.0*x[0], x[0]*x[2]+2.0*x[1],  x[0]*x[1]],
               [exp(x[0]),          -exp(x[1]),          1.0]])
    return y
The Newton method uses the matrix solver from NumPy.
def newton3(fun,dfun,x0,M=100,eps=1.0e-14,debug=False):
    x = x0.copy()
    for i in range(M):
        g = fun(x)
        J = dfun(x)
        h = solve(J,-g)
        x = x + h
        if debug:
            print(i,x,norm(g))
        if norm(h) < eps * norm(x):
            return x
Now we solve
x0 = array([1.0,1.0,1.0])
x = newton3(f,df,x0,debug=True)
print('Root = ',x)
print('f(x) = ',f(x))
0 [2.1893261 1.59847516 1.39390063] 2.449489742783178
1 [1.85058965 1.44425142 1.278224 ] 2.5243835363236373
2 [1.7801612 1.42443598 1.23929244] 0.4123361696320246
3 [1.77767471 1.42396093 1.23747382] 0.014812983951911496
4 [1.77767192 1.4239606 1.23747112] 1.8310659761129026e-05
5 [1.77767192 1.4239606 1.23747112] 2.4379428075383574e-11
6 [1.77767192 1.4239606 1.23747112] 6.661338147750939e-16
Root = [1.77767192 1.4239606 1.23747112]
f(x) = [ 2.22044605e-16 -4.44089210e-16 4.44089210e-16]