
## 3.5 Secant Method

```python
#%config InlineBackend.figure_format = 'svg'
from pylab import *
```

### 3.5.1 Quasi-Newton methods

The Newton method requires knowledge of the derivative. In many situations it may not be possible to compute it; e.g., if $f(x)$ is available only as a computer code, then the derivative may not be available[1]. In such situations we can try to approximate the derivative, and methods which make use of approximate derivatives are called quasi-Newton methods. For a scalar function $f(x)$, we can approximate the derivative using a finite difference approximation, e.g.,

$$
f'(x_k) \approx \frac{f(x_k + h) - f(x_k)}{h}, \qquad 0 < h \ll 1
$$

Since we want to generate a sequence of approximations to the root, we can use two successive iterates $x_{k-1}$, $x_k$ to estimate the derivative:

$$
f'(x_k) \approx \frac{f(x_k) - f(x_{k-1})}{x_k - x_{k-1}}
$$

Using this approximation in place of the exact derivative in the Newton method leads to the secant method.

### 3.5.2 Secant method

Given two starting values $x_0$, $x_1$ which are sufficiently close to the root $r$, we construct the straight line passing through the points $(x_0, f(x_0))$ and $(x_1, f(x_1))$. The zero of this straight line forms the next approximation $x_2$ to the root and is given by

$$
x_2 = x_1 - f(x_1) \frac{x_1 - x_0}{f(x_1) - f(x_0)}
$$

Inductively, we obtain

$$
x_{k+1} = x_k - f(x_k) \frac{x_k - x_{k-1}}{f(x_k) - f(x_{k-1})}, \qquad k = 1, 2, \ldots
$$

Note that this is just the Newton iteration formula with the exact derivative replaced by the finite difference approximation.
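As an illustration, here is a minimal sketch of the secant iteration in Python (the function name `secant` and the test problem $f(x) = x^2 - 2$ are our own choices, not from the original code). It stops with the relative-change test derived in the error estimate section below.

```python
def secant(f, x0, x1, eps=1.0e-12, maxiter=100):
    """Find a root of f by the secant method, starting from x0, x1."""
    f0, f1 = f(x0), f(x1)
    for k in range(maxiter):
        # Secant update: x2 = x1 - f(x1) * (x1 - x0) / (f(x1) - f(x0))
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
        if abs(x2 - x1) < eps * abs(x2):   # relative-change stopping test
            return x2
        x0, f0 = x1, f1
        x1, f1 = x2, f(x2)
    return x1

# Example: the root of x^2 - 2 near the starting values is sqrt(2)
print(secant(lambda x: x**2 - 2.0, 1.0, 2.0))
```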

### 3.5.3 Convergence analysis

Subtracting each side of the iteration formula from $r$ gives

$$
r - x_{k+1} = r - x_k + f(x_k) \frac{x_k - x_{k-1}}{f(x_k) - f(x_{k-1})}
$$

Define the Newton divided differences

$$
f[x_{k-1}, x_k] := \frac{f(x_k) - f(x_{k-1})}{x_k - x_{k-1}}, \qquad
f[x_{k-1}, x_k, r] := \frac{f[x_k, r] - f[x_{k-1}, x_k]}{r - x_{k-1}}
$$

then it is easy to show that

$$
f[x_{k-1}, x_k] = f'(y_k), \qquad \text{for some } y_k \text{ between } x_{k-1} \text{ and } x_k
$$

and

$$
f[x_{k-1}, x_k, r] = \frac{1}{2} f''(z_k), \qquad \text{for some } z_k \text{ between } x_{k-1}, x_k, r
$$

Using $f(x_k) = f(x_k) - f(r) = -(r - x_k) f[x_k, r]$ in the relation above, we get the error formula

$$
\begin{aligned}
r - x_{k+1} &= -(r - x_{k-1})(r - x_k) \frac{f[x_{k-1}, x_k, r]}{f[x_{k-1}, x_k]} \\
&= -(r - x_{k-1})(r - x_k) \frac{f''(z_k)}{2 f'(y_k)}
\end{aligned}
$$

If we can bound the last factor

$$
\left| \frac{f''(z_k)}{2 f'(y_k)} \right| \le M < \infty
$$

then we get a bound on the error $e_k = |r - x_k|$:

$$
e_{k+1} \le M \, e_{k-1} e_k
$$

From this error relation one can show that the secant method converges with order $p = \frac{1}{2}(1 + \sqrt{5}) \approx 1.618$. Recall that the Newton method converges with order $p = 2$, so we can expect the secant method to be a bit slower than Newton.
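As a quick numerical check (the test problem $f(x) = x^2 - 2$ with root $\sqrt{2}$ is our own choice), we can estimate the order from successive errors via $p_k \approx \log(e_{k+1}/e_k) / \log(e_k/e_{k-1})$; the estimates oscillate toward $p \approx 1.618$ as $k$ grows.

```python
from math import log, sqrt

f = lambda x: x**2 - 2.0
r = sqrt(2.0)                       # exact root
x0, x1 = 2.0, 1.0
e = [abs(r - x0), abs(r - x1)]      # errors e_k = |r - x_k|
for k in range(5):
    # secant update; RHS uses the old x0, x1
    x0, x1 = x1, x1 - f(x1) * (x1 - x0) / (f(x1) - f(x0))
    e.append(abs(r - x1))

# estimate the convergence order from triples of successive errors
for k in range(1, len(e) - 1):
    p = log(e[k+1] / e[k]) / log(e[k] / e[k-1])
    print(f"k = {k}: e = {e[k+1]:.2e}, p = {p:.3f}")
```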

### 3.5.4 Error estimate

We can use the function value $|f(x_k)|$ to decide on convergence, but this may not be a good criterion if the function is very flat around the root. We can also estimate the error in the root itself. By the mean value theorem,

$$
f(x_k) = f(x_k) - f(r) = f'(\xi_k)(x_k - r)
$$

where $\xi_k$ is between $x_k$ and $r$. Hence

$$
r - x_k = -\frac{f(x_k)}{f'(\xi_k)}
$$

If $f'$ is not changing rapidly between $x_k$ and $r$, then

$$
f'(\xi_k) \approx \frac{f(x_{k+1}) - f(x_k)}{x_{k+1} - x_k}
$$

We get the error estimate

$$
r - x_k \approx -\frac{f(x_k)}{\dfrac{f(x_{k+1}) - f(x_k)}{x_{k+1} - x_k}} \approx x_{k+1} - x_k
$$

where the last step uses $f(x_{k+1}) \approx 0$ near the root,

and the relative error estimate

$$
\frac{r - x_k}{r} \approx \frac{x_{k+1} - x_k}{x_{k+1}}
$$

Hence we can use the test

$$
|x_{k+1} - x_k| < \epsilon |x_{k+1}|, \qquad 0 < \epsilon \ll 1
$$

to decide on convergence. This is used in the code examples to stop the iterations.
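To see that the computable quantity $x_{k+1} - x_k$ tracks the true error $r - x_k$, we can print both along the secant iterates (the test problem $f(x) = x^2 - 2$ is again our own choice):

```python
from math import sqrt

f = lambda x: x**2 - 2.0
r = sqrt(2.0)                       # exact root, for comparison only
x0, x1 = 2.0, 1.0
for k in range(5):
    x2 = x1 - f(x1) * (x1 - x0) / (f(x1) - f(x0))
    # true error in x1 versus the computable estimate x2 - x1
    print(f"r - x_k = {r - x1:+.3e},   x_(k+1) - x_k = {x2 - x1:+.3e}")
    x0, x1 = x1, x2
```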

### 3.5.5 Accuracy of FD approximation

The secant method can be written as

$$
x_{k+1} = x_k - \frac{f(x_k)}{g_k}, \qquad g_k = \frac{f(x_k) - f(x_{k-1})}{x_k - x_{k-1}}
$$

As $x_k \to r$, the differences $x_k - x_{k-1}$ and $f(x_k) - f(x_{k-1})$ suffer from round-off error (subtraction of nearly equal quantities), and $g_k$ can then be inaccurate. Let us analyze this using a Taylor expansion of the finite difference approximation:

$$
\frac{f(x+h) - f(x)}{h} = f'(x) + \frac{1}{2} h f''(\xi), \qquad \xi \text{ between } x \text{ and } x + h
$$

We expect that as $h \to 0$ we get a more accurate approximation of the derivative. However, this assumes that we can evaluate $f(x)$ exactly and that the arithmetic is exact, neither of which is true on a computer. What we actually compute is (the round-off error in the subtraction is of similar order as the unit round)

$$
\frac{\hat{f}(x+h) - \hat{f}(x)}{h}
$$

where, assuming $f(x) = O(1)$,

$$
\hat{f}(x+h) = f(x+h) + \epsilon_2, \qquad \hat{f}(x) = f(x) + \epsilon_1
$$

and

$$
\epsilon_1, \epsilon_2 \approx C u
$$

where $u$ is the unit round of the computer.

Hence

$$
\begin{aligned}
\frac{\hat{f}(x+h) - \hat{f}(x)}{h} &= \frac{f(x+h) - f(x)}{h} + \frac{Cu}{h} \\
&= f'(x) + \frac{1}{2} h f''(\xi) + \frac{Cu}{h}
\end{aligned}
$$

The error consists of

1. discretization error: $\frac{1}{2} h f''(\xi)$
2. round-off error: $\frac{Cu}{h}$

As $h \to 0$, the discretization error decreases but the round-off error increases. There is an optimum value of $h$ for which the total error is minimized, given by

$$
h_{\mathrm{opt}} = \left( \frac{2Cu}{|f''|} \right)^{1/2}
$$

For the above finite difference approximation, the optimum step size is $h_{\mathrm{opt}} = O(\sqrt{u}) \approx O(10^{-8})$ in double precision.
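To see this trade-off numerically, here is a small sketch (the test function $f = \sin$ at $x = 1$ is our own choice); the printed error should shrink until $h \approx 10^{-8}$ and then grow again as round-off dominates.

```python
import numpy as np

# Forward-difference error for f = sin at x = 1; exact derivative is cos(1)
x = 1.0
for h in 10.0 ** -np.arange(1.0, 16.0):
    fd = (np.sin(x + h) - np.sin(x)) / h
    print(f"h = {h:.0e}, error = {abs(fd - np.cos(x)):.3e}")
```

Based on this analysis, we can modify the secant method as follows. If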

$$
|x_k - x_{k-1}| > \left| \frac{x_{k-1} + x_k}{2} \right| \sqrt{u}
$$

use $g_k$ as given above. Otherwise, use the one-sided finite difference approximation

$$
g_k = \frac{f(x_k + h) - f(x_k)}{h}, \qquad h = |x_k| \sqrt{u}
$$
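A sketch of this safeguarded slope computation (the function name `slope` is our own; `np.finfo(float).eps` plays the role of the unit round $u$, and the fallback assumes $x_k \ne 0$):

```python
import numpy as np

u = np.finfo(float).eps            # unit round of double precision

def slope(f, xk, xkm1):
    """Approximation g_k of f' for the secant update, guarding against round-off."""
    if abs(xk - xkm1) > abs(0.5 * (xk + xkm1)) * np.sqrt(u):
        # iterates well separated: use the usual secant slope
        return (f(xk) - f(xkm1)) / (xk - xkm1)
    # iterates too close: one-sided difference with h = |x_k| * sqrt(u)
    h = abs(xk) * np.sqrt(u)       # assumes x_k != 0
    return (f(xk + h) - f(xk)) / h
```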

### 3.5.6 Complex step method

We can take a complex step instead of a real step. A Taylor expansion with a small imaginary step $ih$ yields

$$
f(x + ih) = f(x) + i h f'(x) - \frac{1}{2} h^2 f''(x) + O(h^3)
$$

Hence

$$
\operatorname{Im} f(x + ih) = h f'(x) + O(h^3)
$$

and

$$
\frac{\operatorname{Im} f(x + ih)}{h} = f'(x) + O(h^2)
$$

is a second order accurate approximation to the derivative. This formula does not involve subtraction of nearly equal quantities, hence it does not suffer from round-off error even if we take $h$ very small. In fact, it is perfectly fine to take $h$ smaller than machine precision, e.g., $h = 10^{-20}$.
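A minimal sketch of this formula (the helper name `deriv_cs` and the test function are our own choices). Note that this requires the code for $f$ to accept complex arguments and to be composed of analytic operations.

```python
import numpy as np

def deriv_cs(f, x, h=1.0e-20):
    # Complex-step derivative: f'(x) = Im[f(x + i h)] / h + O(h^2)
    return np.imag(f(x + 1j * h)) / h

# Example: derivative of sin at x = 1; exact value is cos(1)
print(deriv_cs(np.sin, 1.0), np.cos(1.0))
```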

Footnotes

1. There are methods to differentiate a computer program, known as automatic differentiation.