#%config InlineBackend.figure_format = 'svg'
from pylab import *
import sympy
The Newton-Raphson method is an iterative method to find the roots of a function. It starts with an initial guess for the root and tries to improve it. Given an initial guess $x_0$ for the root, we have

$$f(x_0) \ne 0$$

Let us try to find a correction $\Delta x_0$ which brings it closer to the root:

$$f(x_0 + \Delta x_0) = 0$$

Using Taylor's formula and truncating it at the linear term,

$$f(x_0) + \Delta x_0 f'(x_0) \approx 0 \qquad \Longrightarrow \qquad \Delta x_0 \approx -\frac{f(x_0)}{f'(x_0)}$$

We obtain a *better* estimate for the root:

$$x_1 = x_0 + \Delta x_0 = x_0 - \frac{f(x_0)}{f'(x_0)}$$

Geometrically, we are approximating the function by its tangent line at $x_0$, which is given by

$$y = f(x_0) + (x - x_0) f'(x_0)$$

and finding the zero of this tangent line, see figure. We now repeat the same process for $x_1, x_2, \ldots$ until some convergence criterion is satisfied. The iterations are given by

$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}, \qquad n = 0, 1, 2, \ldots$$

This algorithm requires that $f$ is differentiable and $f'$ is not zero in a neighbourhood of the root. The method is implemented in the code examples below.
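Before the concrete examples, here is the bare iteration in code form; this is a minimal sketch (the name `newton_iterate` is illustrative), assuming `f` and `df` are Python callables for $f$ and $f'$. The stopping criterion is discussed later.

def newton_iterate(f, df, x, n):
    # Perform n Newton-Raphson updates starting from x
    for k in range(n):
        x = x - f(x)/df(x)
    return x

# For example, five iterations for f(x) = x**2 - 2 starting from x = 1
print(newton_iterate(lambda t: t**2 - 2.0, lambda t: 2.0*t, 1.0, 5))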
Example 1 (Affine function)

Consider an affine function $f(x) = ax + b$ and let $x_0$ be any initial guess. The Newton iteration yields

$$x_1 = x_0 - \frac{a x_0 + b}{a} = -\frac{b}{a}$$

so that we converge to the root in one iteration.
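A quick numerical check of this (a throwaway sketch with the arbitrary choice $a = 2$, $b = -3$):

a, b = 2.0, -3.0
f = lambda x: a*x + b
x0 = 100.0              # any initial guess
x1 = x0 - f(x0)/a       # one Newton step, since f'(x) = a
print(x1, -b/a)         # both print 1.5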
Example 2

Find the root of

$$f(x) = \exp(x) - \frac{3}{2} - \arctan x$$

which lies in the interval $[\tfrac{1}{2}, 1]$.
We define the function using sympy and automatically compute its derivative. Then we convert both into numerical functions with `lambdify`.
x = sympy.Symbol('x')
# Define the function
f = sympy.exp(x)-3.0/2.0-sympy.atan(x)
# Compute its derivative
df = sympy.diff(f,x)
print("f = ", f)
print("df= ", df)
# Make functions
F = sympy.lambdify(x, f, modules=['numpy'])
DF = sympy.lambdify(x, df, modules=['numpy'])
# Plot the functions
X = linspace(-1,1,100)
figure(figsize=(9,4))
subplot(1,2,1), plot(X,F(X)), grid(True), title('Function')
subplot(1,2,2), plot(X,DF(X)), grid(True), title('Derivative');
f = exp(x) - atan(x) - 1.5
df= exp(x) - 1/(x**2 + 1)
Now we implement the Newton method. We have to specify the maximum number of iterations, an initial guess and a tolerance to decide when to stop.
M = 20 # maximum iterations
x = 0.5 # initial guess
eps = 1e-14 # relative tolerance on root
f = F(x)
for i in range(M):
    df = DF(x)
    dx = -f/df
    x = x + dx
    e = abs(dx)
    f = F(x)
    print("%6d %22.14e %22.14e %22.14e" % (i,x,e,abs(f)))
    if e < eps * abs(x):
        break
0 8.71059792151655e-01 3.71059792151655e-01 1.72847808769943e-01
1 7.76133043351671e-01 9.49267487999836e-02 1.30353403123729e-02
2 7.67717620302437e-01 8.41542304923425e-03 9.81774270505387e-05
3 7.67653269950858e-01 6.43503515784806e-05 5.71995606435394e-09
4 7.67653266201279e-01 3.74957960683442e-09 2.22044604925031e-16
5 7.67653266201279e-01 1.45556000556775e-16 0.00000000000000e+00
Example 3 (Absolute and relative tolerance)
We consider the same function as in the previous example, but shifted so that the root lies at a large distance from zero,

$$f(x) = \exp(x - x_0) - \frac{3}{2} - \arctan(x - x_0)$$

where $x_0$ is some large number, e.g., $x_0 = 10^{10}$. The root lies around $x_0$ and, due to the finite precision of floating point numbers, we cannot compute it to good absolute tolerance, but only to good relative tolerance.
Define the function and its derivative.
x0 = 1e10
# Define the function
F = lambda x: exp(x-x0)-3.0/2.0-arctan(x-x0)
# Compute its derivative
DF = lambda x: exp(x-x0) - 1.0/(1 + (x-x0)**2)
Now we implement the Newton method. We have to specify the maximum number of iterations, an initial guess and a tolerance to decide when to stop.
M = 50 # maximum iterations
eps = 1e-14 # relative tolerance on root
Using relative tolerance on the root
x = x0 + 0.5 # initial guess
f = F(x)
for i in range(M):
    df = DF(x)
    dx = -f/df
    x = x + dx
    e = abs(dx)
    f = F(x)
    print("%6d %22.14e %22.14e %22.14e" % (i,x,e,abs(f)))
    if e < eps * abs(x):
        break
0 1.00000000008711e+10 3.71059792151655e-01 1.72847126992936e-01
1 1.00000000007761e+10 9.49264320087584e-02 1.30346281936581e-02
2 1.00000000007677e+10 8.41497025309343e-03 9.77825039385483e-05
3 1.00000000007677e+10 6.40915294389154e-05 1.15114296217467e-06
The root converges in the relative sense within a few iterations.
Using absolute tolerance on the root
x = x0 + 0.5 # initial guess
f = F(x)
for i in range(M):
    df = DF(x)
    dx = -f/df
    x = x + dx
    e = abs(dx)
    f = F(x)
    print("%6d %22.14e %22.14e %22.14e" % (i,x,e,abs(f)))
    if e < eps:
        break
0 1.00000000008711e+10 3.71059792151655e-01 1.72847126992936e-01
1 1.00000000007761e+10 9.49264320087584e-02 1.30346281936581e-02
2 1.00000000007677e+10 8.41497025309343e-03 9.77825039385483e-05
3 1.00000000007677e+10 6.40915294389154e-05 1.15114296217467e-06
4 1.00000000007677e+10 7.54605114902618e-07 1.15114296217467e-06
5 1.00000000007677e+10 7.54605114902618e-07 1.15114296217467e-06
   ...                    (iterations 6-48 are identical to iteration 5)
49 1.00000000007677e+10 7.54605114902618e-07 1.15114296217467e-06
There is no convergence. The root cannot be computed to good absolute accuracy, and consequently we cannot make $f(x)$ arbitrarily small. We can only reduce the error to about $10^{-6}$, which is also achieved when using the relative tolerance as stopping criterion.
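A common way to handle both situations with one test (an aside, not part of the code above) is to combine absolute and relative tolerances:

def converged(dx, x, atol=1e-14, rtol=1e-14):
    # Mixed test: atol matters for roots near zero,
    # rtol for roots of large magnitude
    return abs(dx) < atol + rtol*abs(x)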
1 Convergence analysis

Let $r$ be a root of $f$. From Taylor's formula with remainder term,

$$0 = f(r) = f(x_n) + (r - x_n) f'(x_n) + \frac{1}{2} (r - x_n)^2 f''(\xi_n), \qquad \xi_n \textrm{ between } r \textrm{ and } x_n$$

Solve for $r$ from the linear term:

$$r = \underbrace{x_n - \frac{f(x_n)}{f'(x_n)}}_{x_{n+1}} - \frac{1}{2} (r - x_n)^2 \frac{f''(\xi_n)}{f'(x_n)}$$

The error in the root is

$$r - x_{n+1} = -(r - x_n)^2 \frac{f''(\xi_n)}{2 f'(x_n)}$$

At least in a neighbourhood of the root, we will assume that

$$\left| \frac{f''(\xi_n)}{2 f'(x_n)} \right| \le C$$

Then the error $e_n = |r - x_n|$ satisfies

$$e_{n+1} \le C e_n^2$$

To show convergence $x_n \to r$, we have to show that $e_n \to 0$.
Theorem 1 (Convergence of Newton method)

Assume that $f, f', f''$ are continuous in some neighbourhood of the root $r$, and $f'(r) \ne 0$. If $x_0$ is chosen sufficiently close to $r$, then $x_n \to r$. Moreover,

$$\lim_{n \to \infty} \frac{r - x_{n+1}}{(r - x_n)^2} = -\frac{f''(r)}{2 f'(r)}$$

Proof. Pick a sufficiently small interval $I = [r - \delta, r + \delta]$ on which $f' \ne 0$. This is possible since $f'$ is continuous and $f'(r) \ne 0$. Define

$$M = \frac{\max_{x \in I} |f''(x)|}{\min_{x \in I} |f'(x)|}$$

We have

$$|r - x_1| \le M |r - x_0|^2 \qquad \Longrightarrow \qquad M |r - x_1| \le (M |r - x_0|)^2$$

Choose $x_0$ such that

$$|r - x_0| \le \delta \qquad\textrm{and}\qquad M |r - x_0| < 1 \qquad \Longrightarrow \qquad M |r - x_1| < 1$$

Moreover,

$$|r - x_1| \le \underbrace{M |r - x_0|}_{< 1} \cdot |r - x_0| < |r - x_0| \le \delta$$

Hence $x_1 \in I$. Applying the same argument inductively to $x_2, x_3, \ldots$ we obtain

$$\forall n \ge 0 \qquad \begin{cases} |r - x_n| \le \delta \quad \Longrightarrow \quad x_n \in I \\ M |r - x_n| < 1 \end{cases}$$

To show convergence, we write

$$|r - x_{n+1}| \le M |r - x_n|^2 \qquad \Longrightarrow \qquad M |r - x_{n+1}| \le (M |r - x_n|)^2$$

and inductively

$$M |r - x_n| \le (M |r - x_0|)^{2^n}$$

Now taking the limit,

$$\lim_{n \to \infty} |r - x_n| \le \frac{1}{M} \lim_{n \to \infty} (M |r - x_0|)^{2^n} = 0$$

since $M |r - x_0| < 1$, which proves convergence. In the error formula

$$r - x_{n+1} = -(r - x_n)^2 \frac{f''(\xi_n)}{2 f'(x_n)}$$

$\xi_n$ lies between $r$ and $x_n$, hence $\xi_n \to r$, and thus

$$\lim_{n \to \infty} \frac{r - x_{n+1}}{(r - x_n)^2} = -\lim_{n \to \infty} \frac{f''(\xi_n)}{2 f'(x_n)} = -\frac{f''(r)}{2 f'(r)}$$

2 Error estimate

From the mean value theorem,
$$f(x_n) = f(x_n) - f(r) = f'(\xi_n)(x_n - r)$$

where $\xi_n$ is between $x_n$ and $r$, so that

$$r - x_n = -\frac{f(x_n)}{f'(\xi_n)}$$

If $f'$ is not changing rapidly between $x_n$ and $r$, then

$$f'(\xi_n) \approx f'(x_n)$$

We get the *error estimate*

$$r - x_n \approx -\frac{f(x_n)}{f'(x_n)} = x_{n+1} - x_n$$

and the *relative error estimate*

$$\frac{r - x_n}{r} \approx \frac{x_{n+1} - x_n}{x_{n+1}}$$

Hence we can use the test

$$|x_{n+1} - x_n| < \epsilon |x_{n+1}|, \qquad 0 < \epsilon \ll 1$$

to decide on convergence. This is the stopping criterion used in the code examples above.
Theorem 2 (Convex functions)

Assume that $f \in C^2(\mathbb{R})$ is increasing, convex and has a zero. Then the zero is unique and the Newton method converges to it starting from any initial point.

Proof. Let $r$ be a zero of $f$. The function cannot have more than one zero since, for $x > r$,

$$f(x) = f(r) + \int_r^x f'(s)\, ds = \int_r^x f'(s)\, ds > 0$$

unless $f' \equiv 0$.

We have $f' > 0$ and $f'' > 0$, so

$$r - x_{n+1} = -(r - x_n)^2 \frac{f''(\xi_n)}{2 f'(x_n)} \le 0 \qquad \Longrightarrow \qquad x_{n+1} \ge r \qquad \forall n \ge 0$$

Also $f(x_n) > 0$ for $n = 1, 2, \ldots$ (since $x_n \ge r$ and $f$ is increasing), and hence

$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} < x_n, \qquad n = 1, 2, \ldots$$

Thus $\{x_n\}_{n \ge 1}$ is a decreasing sequence bounded below by $r$. Hence $x^* = \lim_{n \to \infty} x_n$ exists and

$$x^* = x^* - \frac{f(x^*)}{f'(x^*)} \qquad \Longrightarrow \qquad f(x^*) = 0 \qquad \Longrightarrow \qquad x^* = r$$

Example 4 (Square root)

For $a > 0$, the function

$$f(x) = x^2 - a$$

has roots $\pm\sqrt{a}$. Newton's method is

$$x_{k+1} = \frac{1}{2} \left( x_k + \frac{a}{x_k} \right)$$

which uses only basic arithmetic operations (though division is not so basic). Geometrically, it is easy to see that if $x_0 > 0$, then we converge to the positive root, and if $x_0 < 0$, then we converge to the negative root.
from math import sqrt
a = 2.0
r = sqrt(a)
Define the function and its derivative.
def f(x):
    return x**2 - a

def df(x):
    return 2.0*x
Now we implement the Newton method as a function.
def newton(f,df,x0,M=100,eps=1.0e-15):
    x = x0
    for i in range(M):
        dx = - f(x)/df(x)
        x = x + dx
        print('%3d %22.14f %22.14e %22.14e'%(i,x,x-r,abs(f(x))))
        if abs(dx) < eps * abs(x):
            return x
    print('No convergence, current root = %e' % x)
We call the Newton method with an initial guess of 2.
x0 = 2.0
x = newton(f,df,x0)
0 1.50000000000000 8.57864376269049e-02 2.50000000000000e-01
1 1.41666666666667 2.45310429357160e-03 6.94444444444464e-03
2 1.41421568627451 2.12390141474117e-06 6.00730488287127e-06
3 1.41421356237469 1.59472435257157e-12 4.51061410444709e-12
4 1.41421356237310 0.00000000000000e+00 4.44089209850063e-16
5 1.41421356237309 -2.22044604925031e-16 4.44089209850063e-16
From the last column, we observe that in each iteration, the number of correct decimal places doubles.
3 Implication of quadratic convergence

What does quadratic convergence imply in practical terms? For the Newton method,

$$e_{k+1} \le M e_k^2$$

Suppose $M \approx 1$ and $e_0 = 10^{-1}$. Then

$$\begin{aligned}
e_1 &\approx 10^{-2} \\
e_2 &\approx 10^{-4} \\
e_3 &\approx 10^{-8} \\
e_4 &\approx 10^{-16} \qquad \textrm{(limit of double precision)}
\end{aligned}$$

If the root is $O(1)$, then $x_1$ is accurate to 2 digits, $x_2$ to 4 digits, $x_3$ to 8 digits and $x_4$ to 16 digits. Newton's method doubles the number of accurate digits in each iteration. Of course, this happens only after the method enters the quadratic convergence regime; in the first few iterations there may be no reduction of the error, or a less than quadratic reduction.
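The sequence of errors above is just repeated squaring, which we can mimic directly (a toy check with $M = 1$):

e = 1.0e-1
for k in range(1, 5):
    e = e**2            # e_{k+1} = e_k**2 since M = 1
    print(k, e)         # 1e-02, 1e-04, 1e-08, 1e-16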
The square root example above shows this quadratic convergence: the errors $8.6 \times 10^{-2}$, $2.5 \times 10^{-3}$, $2.1 \times 10^{-6}$, $1.6 \times 10^{-12}$ roughly square in each iteration.
Example 5 (Reciprocal)

Given a real number $a > 0$, we want to find its reciprocal. Since division cannot be performed exactly on a computer, we would like a method which uses only addition and multiplication. We can cook up a function whose root is the reciprocal, e.g.,

$$f(x) = \frac{1}{x} - a$$

Applying the Newton method to this function yields the iteration, see figure,

$$x_{k+1} = 2 x_k - a x_k^2$$

which involves only subtraction and multiplication. If we start with any $0 < x_0 \le \frac{1}{a}$, then we converge to the root, as we can verify geometrically. If we start with $x_0 > \frac{1}{a}$ but sufficiently close to it, then we again converge to the root. More precisely, $x_1 \in (0, 1/a]$ iff $x_0 \in (0, 2/a)$, and the subsequent iterates then converge; if $x_0 > 2/a$, then $x_1 < 0$, after which we have divergence to $-\infty$.

Thus any initial guess in the interval $(0, 2/a)$ converges to the desired root. Note that if $0 < a < 1$, then $x_0 = a$ is a valid initial guess.
Let us try $a = 10^{-10}$ with the initial guess $x_0 = a$.
a = 1.0e-10
def F(x):
    return 1/x - a

def DF(x):
    return -1/x**2
Now we implement the Newton method.
M = 100 # maximum iterations
x = a # initial guess
eps = 1e-15 # relative tolerance on root
f = F(x)
for i in range(M):
    df = DF(x)
    dx = -f/df
    x = x + dx
    e = abs(dx)
    f = F(x)
    print("%6d %22.14e %22.14e %22.14e" % (i,x,e,abs(f)))
    if e < eps * abs(x):
        break
0 2.00000000000000e-10 1.00000000000000e-10 5.00000000000000e+09
1 4.00000000000000e-10 2.00000000000000e-10 2.50000000000000e+09
2 8.00000000000000e-10 4.00000000000000e-10 1.25000000000000e+09
3 1.60000000000000e-09 8.00000000000000e-10 6.25000000000000e+08
4 3.20000000000000e-09 1.60000000000000e-09 3.12500000000000e+08
5 6.40000000000000e-09 3.20000000000000e-09 1.56250000000000e+08
6 1.28000000000000e-08 6.40000000000000e-09 7.81250000000000e+07
7 2.56000000000000e-08 1.28000000000000e-08 3.90625000000000e+07
8 5.12000000000000e-08 2.56000000000000e-08 1.95312500000000e+07
9 1.02400000000000e-07 5.12000000000000e-08 9.76562500000000e+06
10 2.04800000000000e-07 1.02400000000000e-07 4.88281250000000e+06
11 4.09600000000000e-07 2.04800000000000e-07 2.44140625000000e+06
12 8.19200000000000e-07 4.09600000000000e-07 1.22070312500000e+06
13 1.63840000000000e-06 8.19200000000000e-07 6.10351562500000e+05
14 3.27680000000000e-06 1.63840000000000e-06 3.05175781250000e+05
15 6.55360000000000e-06 3.27680000000000e-06 1.52587890625000e+05
16 1.31072000000000e-05 6.55360000000000e-06 7.62939453124999e+04
17 2.62144000000000e-05 1.31072000000000e-05 3.81469726562499e+04
18 5.24287999999999e-05 2.62143999999999e-05 1.90734863281249e+04
19 1.04857599999999e-04 5.24287999999996e-05 9.53674316406245e+03
20 2.09715199999998e-04 1.04857599999998e-04 4.76837158203120e+03
21 4.19430399999991e-04 2.09715199999993e-04 2.38418579101557e+03
22 8.38860799999965e-04 4.19430399999974e-04 1.19209289550776e+03
23 1.67772159999986e-03 8.38860799999895e-04 5.96046447753856e+02
24 3.35544319999944e-03 1.67772159999958e-03 2.98023223876903e+02
25 6.71088639999775e-03 3.35544319999831e-03 1.49011611938427e+02
26 1.34217727999910e-02 6.71088639999325e-03 7.45058059691883e+01
27 2.68435455999640e-02 1.34217727999730e-02 3.72529029845691e+01
28 5.36870911998559e-02 2.68435455998919e-02 1.86264514922596e+01
29 1.07374182399424e-01 5.36870911995677e-02 9.31322574610478e+00
30 2.14748364797694e-01 1.07374182398271e-01 4.65661287302739e+00
31 4.29496729590777e-01 2.14748364793083e-01 2.32830643648869e+00
32 8.58993459163107e-01 4.29496729572330e-01 1.16415321821935e+00
33 1.71798691825243e+00 8.58993459089320e-01 5.82076609084674e-01
34 3.43597383620971e+00 1.71798691795728e+00 2.91038304517337e-01
35 6.87194767123882e+00 3.43597383502911e+00 1.45519152233668e-01
36 1.37438953377553e+01 6.87194766651645e+00 7.27595760918342e-02
37 2.74877906566211e+01 1.37438953188658e+01 3.63797880209171e-02
38 5.49755812376843e+01 2.74877905810632e+01 1.81898939854586e-02
39 1.09951162173137e+02 5.49755809354528e+01 9.09494696772928e-03
40 2.19902323137349e+02 1.09951160964211e+02 4.54747345886464e-03
41 4.39804641438994e+02 2.19902318301645e+02 2.27373670443232e-03
42 8.79609263535176e+02 4.39804622096182e+02 1.13686832721616e-03
43 1.75921844969911e+03 8.79609186163930e+02 5.68434138608081e-04
44 3.51843658991326e+03 1.75921814021415e+03 2.84217044304043e-04
45 7.03687194188691e+03 3.51843535197365e+03 1.42108497152026e-04
46 1.40737389320171e+04 7.03686699013023e+03 7.10542235760217e-05
47 2.81474580570215e+04 1.40737191250044e+04 3.55270867880284e-05
48 5.62948368861036e+04 2.81473788290820e+04 1.77635183940494e-05
49 1.12589356861341e+05 5.62945199752375e+04 8.88173419709507e-06
50 2.25177446086354e+05 1.12588089225013e+05 4.44084209868827e-06
51 4.50349821684486e+05 2.25172375598132e+05 2.22039604962561e-06
52 9.00679361872783e+05 4.50329540188297e+05 1.11017302537576e-06
53 1.80127760141428e+06 9.00598239541493e+05 5.55061513813778e-07
54 3.60223074272882e+06 1.80095314131454e+06 2.77505759158689e-07
55 7.20316387882525e+06 3.60093313609643e+06 1.38727884082944e-07
56 1.44011392006640e+07 7.19797532183872e+06 6.93389510486708e-08
57 2.87815391203003e+07 1.43803999196363e+07 3.46444935387308e-08
58 5.74802405411872e+07 2.86987014208869e+07 1.72972827981375e-08
59 1.14630083277107e+08 5.71498427359199e+07 8.62371345646323e-09
60 2.27946160955002e+08 1.13316077677895e+08 4.28700084182336e-09
61 4.50696376680593e+08 2.22750215725590e+08 2.11878863851772e-09
62 8.81080030965884e+08 4.30383654285291e+08 1.03497067786652e-09
63 1.68452985983508e+09 8.03449828869199e+08 4.93637443801620e-10
64 3.08529563480257e+09 1.40076577496748e+09 2.24118048435897e-10
65 5.21868635419195e+09 2.13339071938939e+09 9.16191033777574e-11
66 7.71390398204098e+09 2.49521762784902e+09 2.96360445149611e-11
67 9.47737649966719e+09 1.76347251762621e+09 5.51443218860368e-12
68 9.97268646768999e+09 4.95309968022797e+08 2.73883395397065e-13
69 9.99992539709528e+09 2.72389294052880e+07 7.46034612872794e-16
70 9.99999999944344e+09 7.46023481644486e+04 5.56560720338062e-21
71 1.00000000000000e+10 5.56560720276110e-01 0.00000000000000e+00
72 1.00000000000000e+10 0.00000000000000e+00 0.00000000000000e+00
Example 6 (Slow convergence)
The quadratic convergence of the Newton method is obtained only when we are sufficiently close to the root. It may happen that the convergence is initially very slow. Consider the reciprocal example with $a = 10^{-10}$ starting from $x_0 = a$: the method converges, but this initial guess is very far from the root. The first few iterates are

$$\begin{aligned}
x_1 &= 2 \cdot 10^{-10} - 10^{-30} \approx 2 \cdot 10^{-10} = 2 x_0 \\
x_2 &= 2 \cdot 2 \cdot 10^{-10} - 4 \cdot 10^{-30} \approx 2^2 \cdot 10^{-10} = 2 x_1
\end{aligned}$$

Thus the iterates initially double in each step, and the change per step is very small. This slow growth continues until we get close to the root, i.e.,

$$2^k \cdot 10^{-10} \approx 10^{10}$$

which needs $k \approx 67$ iterations, after which we have rapid convergence.
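A one-line check of this estimate ($2^k = 10^{20}$ gives $k = 20/\log_{10} 2$):

from math import log10, ceil
print(ceil(20/log10(2)))    # 67 doublings before rapid convergence sets in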
4 Inflection point

If the function has an inflection point, see figure, the Newton iterates may cycle indefinitely without converging. The Newton update in step $k$ is

$$\Delta x_k = -\frac{f(x_k)}{f'(x_k)}, \qquad x_{k+1} = x_k + \Delta x_k$$

We can use $|f(x)|$ as a metric to measure convergence towards the root: $|f(x_k)|$ must decrease monotonically towards zero. We accept the step $\Delta x_k$ if $|f(x_{k+1})| < |f(x_k)|$; otherwise we halve the step size, $\Delta x_k \leftarrow \frac{1}{2} \Delta x_k$. We can halve the step size repeatedly until a reduction in $|f(x)|$ is obtained; such a step size always exists. Moreover, to prevent wild oscillations of the iterates, the step size can also be restricted, e.g., don't allow the step size to more than double in successive steps:

$$|\Delta x_k| \le 2 |\Delta x_{k-1}|$$
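A minimal sketch of this safeguarded iteration (the name `damped_newton` and its arguments are illustrative, not code from this document; the growth restriction on the step size is omitted for brevity):

def damped_newton(F, DF, x0, M=100, eps=1.0e-14):
    x = x0
    f = F(x)
    for i in range(M):
        dx = -f/DF(x)
        # Halve the step until |f| decreases; such a step exists
        while abs(F(x + dx)) >= abs(f) and abs(dx) > 1e-300:
            dx = 0.5*dx
        x = x + dx
        f = F(x)
        if abs(dx) < eps*abs(x):
            return x
    print('No convergence, current root = %e' % x)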
Example 7 (Complex roots)

The Newton method can converge to a complex root if we start with a complex initial guess. The roots of

$$f(x) = x^5 + 1$$

lie on the unit circle in the complex plane. We can write

$$-1 = \exp(\mathrm{i} (2n+1)\pi), \qquad n = 0, 1, 2, \ldots$$

so the fifth roots are

$$(-1)^{1/5} = \exp(\mathrm{i} (2n+1)\pi/5), \qquad n = 0, 1, 2, 3, 4$$

Let us compute the exact roots and plot them on a graph.
n = array([0,1,2,3,4])
r = exp(1j*(2*n+1)*pi/5) # 5'th roots of -1
plot(real(r),imag(r),'o')
grid(True), axis('equal')
for x in r:
    print(x, x**5)
(0.8090169943749475+0.5877852522924731j) (-1.0000000000000002+0j)
(-0.30901699437494734+0.9510565162951536j) (-1+4.440892098500626e-16j)
(-1+1.2246467991473532e-16j) (-1+6.123233995736766e-16j)
(-0.30901699437494756-0.9510565162951535j) (-1+7.216449660063518e-16j)
(0.8090169943749473-0.5877852522924734j) (-1.0000000000000004+1.1102230246251565e-15j)
Let us find a root using the Newton method with starting guess $1 + \mathrm{i}$.
f = lambda x: x**5 + 1.0
df = lambda x: 5.0 * x**4
def newton(f,df,x0,M=100,eps=1.0e-15):
    x = x0
    for i in range(M):
        dx = - f(x)/df(x)
        x = x + dx
        print(i,x,abs(f(x)))
        if abs(dx) < eps * abs(x):
            return x
x0 = 1.0 + 1.0j
x = newton(f,df,x0)
print('Root = ',x)
print('x**5 = ',x**5)
0 (0.85+0.8j) 1.4844926068747277
1 (0.7869450486819533+0.6530228729743794j) 0.3587868496977192
2 (0.8000098256925355+0.5887168663596131j) 0.04466991909204323
3 (0.8091271135998733+0.587660739881959j) 0.0008311338875735857
4 (0.8090169566882729+0.5877852118808937j) 2.762870121656608e-07
5 (0.8090169943749507+0.5877852522924782j) 3.014107274014632e-14
6 (0.8090169943749475+0.5877852522924731j) 2.220446049250313e-16
7 (0.8090169943749475+0.5877852522924731j) 2.220446049250313e-16
Root = (0.8090169943749475+0.5877852522924731j)
x**5 = (-1.0000000000000002+0j)
We do converge to a root, but which one depends on the initial guess. Moreover, a single run can only find one root.
5 Implicit functions

Suppose we have a function $y = y(x)$ which is defined implicitly by

$$G(x, y) = 0$$

We may not be able to find an explicit expression for $y(x)$; instead we find the value of $y$ for some discrete set of $x$ values. This leads to a root finding problem for each $x$, to which we can apply the Newton method.

Take a value of $x$ and make an initial guess $y_0$. This may not satisfy $G(x, y_0) = 0$, so we look for a correction $y_0 + \Delta y_0$ such that $G(x, y_0 + \Delta y_0) = 0$. By Taylor expansion,

$$G(x, y_0) + G_y(x, y_0) \Delta y_0 = 0 \quad \Longrightarrow \quad \Delta y_0 = -\frac{G(x, y_0)}{G_y(x, y_0)}$$

and hence we get an improved estimate

$$y_1 = y_0 + \Delta y_0 = y_0 - \frac{G(x, y_0)}{G_y(x, y_0)}$$

This can be used to generate an iterative scheme

$$y_{k+1} = y_k - \frac{G(x, y_k)}{G_y(x, y_k)}, \qquad k = 0, 1, 2, \ldots$$

If we have to solve this problem for several values of $x$, we can use the previous solution as the initial guess for the next $x$, which can speed up the convergence of the Newton method.
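Here is a minimal sketch of this continuation idea for a hypothetical implicit relation $G(x, y) = y + y^3 - x = 0$ (this particular $G$ is chosen only for illustration; it is not from the text):

from numpy import linspace, zeros

G  = lambda x, y: y + y**3 - x      # hypothetical implicit relation
Gy = lambda x, y: 1.0 + 3.0*y**2    # dG/dy, nonzero everywhere

X = linspace(0.0, 2.0, 21)
Y = zeros(len(X))
y = 0.0                             # initial guess for the first x
for j, x in enumerate(X):
    for k in range(50):             # Newton iterations in y
        dy = -G(x, y)/Gy(x, y)
        y = y + dy
        if abs(dy) < 1e-14*(1.0 + abs(y)):
            break
    Y[j] = y                        # previous y seeds the next x
print(Y[-1])                        # last value solves y + y**3 = 2, i.e., y = 1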
6 System of equations

Consider a function $f : \mathbb{R}^N \to \mathbb{R}^N$ such that for $x = [x_1, x_2, \ldots, x_N]^\top$

$$f(x) = [f_1(x), \ldots, f_N(x)]^\top \in \mathbb{R}^N$$

We want to find an $\alpha \in \mathbb{R}^N$ such that $f(\alpha) = 0$. This is a system of $N$ non-linear equations in $N$ unknowns. We start with an initial guess $x^0 \in \mathbb{R}^N$ and in general $f(x^0) \ne 0$. We will try to find a correction $\Delta x^0$ to $x^0$ such that

$$f(x^0 + \Delta x^0) = 0$$

Approximating by the first two terms of the Taylor expansion,

$$f(x^0) + f'(x^0) \Delta x^0 = 0$$

we can find the correction by solving the matrix problem

$$f'(x^0) \Delta x^0 = -f(x^0)$$

where the Jacobian matrix $f' \in \mathbb{R}^{N \times N}$ is given by

$$[f']_{ij} = \frac{\partial f_i}{\partial x_j}$$

Once we solve for $\Delta x^0 = -[f'(x^0)]^{-1} f(x^0)$, we get a new estimate of the root

$$x^1 = x^0 + \Delta x^0 = x^0 - [f'(x^0)]^{-1} f(x^0)$$

The above steps are applied iteratively:

1. Choose an initial guess $x^0$ and set $k = 0$.
2. Solve the matrix problem $f'(x^k) \Delta x^k = -f(x^k)$.
3. Update the root $x^{k+1} = x^k + \Delta x^k$, set $k = k + 1$ and go to step 2.
Example 8 (Newton method for system)

For $x = (x_0, x_1, x_2) \in \mathbb{R}^3$, define $f : \mathbb{R}^3 \to \mathbb{R}^3$ by

$$f(x) = \begin{bmatrix}
x_0 x_1 - x_2^2 - 1 \\
x_0 x_1 x_2 - x_0^2 + x_1^2 - 2 \\
\exp(x_0) - \exp(x_1) + x_2 - 3
\end{bmatrix}$$

The Jacobian of $f$ is given by

$$f'(x) = \begin{bmatrix}
x_1 & x_0 & -2 x_2 \\
x_1 x_2 - 2 x_0 & x_0 x_2 + 2 x_1 & x_0 x_1 \\
\exp(x_0) & -\exp(x_1) & 1
\end{bmatrix}$$

We implement functions for these.
def f(x):
    y = zeros(3)
    y[0] = x[0]*x[1] - x[2]**2 - 1.0
    y[1] = x[0]*x[1]*x[2] - x[0]**2 + x[1]**2 - 2.0
    y[2] = exp(x[0]) - exp(x[1]) + x[2] - 3.0
    return y

def df(x):
    y = array([[x[1],               x[0],               -2.0*x[2]],
               [x[1]*x[2]-2.0*x[0], x[0]*x[2]+2.0*x[1], x[0]*x[1]],
               [exp(x[0]),          -exp(x[1]),         1.0]])
    return y
The Newton method uses the linear solver `solve` from NumPy.
def newton(fun,dfun,x0,M=100,eps=1.0e-14,debug=False):
    x = x0
    for i in range(M):
        g = fun(x)
        J = dfun(x)
        h = solve(J,-g)
        x = x + h
        if debug:
            print(i,x,norm(g))
        if norm(h) < eps * norm(x):
            return x
Now we solve starting from the initial guess $(1, 1, 1)$.
x0 = array([1.0,1.0,1.0])
x = newton(f,df,x0,debug=True)
print('Root = ',x)
print('f(x) = ',f(x))
0 [2.1893261 1.59847516 1.39390063] 2.449489742783178
1 [1.85058965 1.44425142 1.278224 ] 2.5243835363236373
2 [1.7801612 1.42443598 1.23929244] 0.4123361696320246
3 [1.77767471 1.42396093 1.23747382] 0.014812983951911496
4 [1.77767192 1.4239606 1.23747112] 1.8310659761129026e-05
5 [1.77767192 1.4239606 1.23747112] 2.4379428075383574e-11
6 [1.77767192 1.4239606 1.23747112] 6.661338147750939e-16
Root = [1.77767192 1.4239606 1.23747112]
f(x) = [ 2.22044605e-16 -4.44089210e-16 4.44089210e-16]