
4 Homotopy and continuation methods

#%config InlineBackend.figure_format = 'svg'
from pylab import *
from scipy.integrate import odeint
from scipy.linalg import solve, norm

The material of this Chapter is from Kincaid & Cheney, 2002.

4.1 Main idea

Let $f : X \to Y$ and consider the problem: find $x \in X$ such that $f(x) = 0$. This problem may be difficult to solve; for Newton's method we need a good initial guess, which we may not know.

Suppose we can easily solve the problem: find $x \in X$ such that $g(x) = 0$. Define

$$h(t,x) = t f(x) + (1-t) g(x)$$

Note that

$$h(0,x) = g(x), \qquad h(1,x) = f(x)$$

Partition the interval $[0,1]$ into

$$0 = t_0 < t_1 < t_2 < \ldots < t_m = 1$$

We can easily solve the problem $h(t_0,x) = g(x) = 0$; let $x_0$ be its solution. To solve the problem $h(t_1,x) = 0$, say using Newton's method, we can use $x_0$ as the initial guess. Since the two problems are close, $x_0$ can be expected to be a good initial guess. This process is continued until we solve the problem $h(t_m,x) = f(x) = 0$ using the initial guess $x_{m-1}$.

The homotopy defined above, a linear interpolation between the two problems, is the simplest example.
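As a small illustration, here is a continuation sketch in Python; the test problem $f(x) = x^3 - 2x - 5$, the easy problem $g(x) = x - 1$, and the hand-written Newton iteration are assumptions made for this example only.

# Continuation sketch (illustrative choices of f and g, not from the text)
def f(x):  return x**3 - 2*x - 5
def df(x): return 3*x**2 - 2
def g(x):  return x - 1.0
def dg(x): return 1.0

def newton(F, dF, x, maxiter=20, tol=1e-12):
    # basic Newton iteration for a scalar equation F(x) = 0
    for _ in range(maxiter):
        dx = -F(x) / dF(x)
        x = x + dx
        if abs(dx) < tol:
            break
    return x

m = 10          # number of continuation steps
x = 1.0         # solution of g(x) = 0
for k in range(1, m + 1):
    t = k / m
    h  = lambda x, t=t: t*f(x)  + (1 - t)*g(x)
    dh = lambda x, t=t: t*df(x) + (1 - t)*dg(x)
    x = newton(h, dh, x)        # previous solution is the initial guess
print(x, f(x))                  # x now approximately solves f(x) = 0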

4.2 An ODE

Instead of solving the sequence of problems, we will convert it into an ODE problem. The solution of $h(t,x) = 0$ depends on the parameter $t$, so $x(t)$ satisfies

$$h(t, x(t)) = 0$$

Differentiating with respect to $t$,

$$h_t(t, x(t)) + h_x(t,x(t))\, x'(t) = 0$$

so that

$$x'(t) = -[h_x(t,x(t))]^{-1} h_t(t,x(t)) =: R(t, x(t))$$

For this ODE to be valid, we need $h$ to be differentiable with respect to $t$ and $x$, and $h_x$ must be non-singular. Suppose we know that $x_0$ solves $h(0,x) = 0$. We solve the ODE approximately up to $t = 1$ with initial condition $x(0) = x_0$; since $h(1,x) = f(x)$, the value $x(1)$ is an approximate solution of $f(x) = 0$.
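A minimal sketch of this ODE approach, reusing the scalar example $f(x) = x^3 - 2x - 5$ and $g(x) = x - 1$ (both assumptions made for illustration) with the linear homotopy from above:

# Integrate x'(t) = -h_x^{-1} h_t for h(t,x) = t f(x) + (1-t) g(x)
# (f and g are illustrative choices, not from the text)
from pylab import *
from scipy.integrate import odeint

def f(x):  return x**3 - 2*x - 5
def df(x): return 3*x**2 - 2
def g(x):  return x - 1.0
def dg(x): return 1.0

def R(x, t):
    h_t = f(x) - g(x)                  # partial derivative of h wrt t
    h_x = t*df(x) + (1 - t)*dg(x)      # partial derivative of h wrt x
    return -h_t / h_x

tt = linspace(0.0, 1.0, 51)
xx = odeint(R, 1.0, tt)                # x(0) = 1 solves g(x) = 0
print(xx[-1, 0], f(xx[-1, 0]))         # x(1) approximately solves f(x) = 0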

4.3 Relation to Newton's method

Choose an $x_0$ and consider the homotopy

$$h(t,x) = f(x) - \ee^{-t} f(x_0)$$

so that

$$h(0,x) = f(x) - f(x_0), \qquad h(\infty,x) = f(x)$$

The curve $x(t)$ such that

$$h(t,x(t)) = 0 = f(x(t)) - \ee^{-t} f(x_0)$$

satisfies the ODE

$$0 = f'(x(t))\, x'(t) + \ee^{-t} f(x_0) = f'(x(t))\, x'(t) + f(x(t))$$

so that

$$x'(t) = -[f'(x(t))]^{-1} f(x(t))$$

Applying the forward Euler scheme with $\Delta t = 1$ gives

$$x_{n+1} = x_n - [f'(x_n)]^{-1} f(x_n)$$

which is the Newton-Raphson method.
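As a quick numerical check (an illustration, not from the text): along the exact solution of this ODE we have $f(x(t)) = \ee^{-t} f(x_0)$, while a single forward Euler step with $\Delta t = 1$ is exactly the Newton-Raphson update. The scalar test problem $f(x) = x^3 - 2x - 5$ is again an assumption for this example.

# Check f(x(t)) ~ exp(-t) f(x0) along x' = -f(x)/f'(x), and that one Euler
# step with dt = 1 equals the Newton-Raphson step (illustrative example)
from pylab import *
from scipy.integrate import odeint

def f(x):  return x**3 - 2*x - 5
def df(x): return 3*x**2 - 2

x0 = 3.0
tt = linspace(0.0, 5.0, 6)
xx = odeint(lambda x, t: -f(x)/df(x), x0, tt)[:, 0]
print(f(xx) / (exp(-tt)*f(x0)))        # ratios stay close to 1

print(x0 - f(x0)/df(x0),               # Newton-Raphson step
      x0 + 1.0*(-f(x0)/df(x0)))        # forward Euler step with dt = 1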

4.4 Linear programming

Let $c, x \in \re^n$, $A \in \re^{m \times n}$ with $m < n$, and consider the constrained optimization problem

$$\begin{cases} \max\limits_{x \in \re^n} c^\top x \\ \textrm{subject to } A x = b, \qquad x \ge 0 \end{cases}$$

Choose a starting point $x^0$ which satisfies the constraints, i.e.,

$$x^0 \in \mathcal{F} = \{ x \in \re^n : A x = b, \quad x \ge 0 \}$$

Our goal is to find $x^1$ such that $c^\top x^1 > c^\top x^0$ and $x^1 \in \mathcal{F}$. Equivalently, find a curve $x(t)$ such that

  1. $x(0) = x^0$

  2. $x(t) \ge 0$ for all $t \ge 0$

  3. $A x(t) = b$

  4. $c^\top x(t)$ is an increasing function of $t$, $t \ge 0$

Let us try to find an equation satisfied by such a curve $x(t)$:

$$x'(t) = f(x), \qquad x(0) = x^0$$

To satisfy (2), choose $f$ such that $f_i(x) \to 0$ as $x_i \to 0$. One such choice is

$$D(x) = \diag{[x_1, x_2, \ldots, x_n]}$$
$$f(x) = D(x) G(x), \qquad x_i' = x_i G_i(x)$$

To satisfy (3) we must have $A x'(t) = 0$, i.e., $A D(x) G(x) = 0$. Choose

$$G(x) = P(x) H(x)$$

where $P(x)$ maps $\re^n$ into the null space of $AD(x)$. So we can take

$$P(x) = \textrm{orthogonal projection onto the null space of } AD(x)$$

The null space of $AD(x)$ is a subspace of $\re^n$. By the Projection Theorem in Hilbert spaces, we have

$$\forall v \in \re^n, \qquad Pv \in \textrm{null space of } AD(x) \ \textrm{ iff } \ \ip{v-Pv,w} = 0 \quad \forall w \in \textrm{null space of } AD(x)$$

In particular

$$\ip{v - Pv, Pv} = 0, \qquad \forall v \in \re^n$$

Such a $P$ is given by

$$P = I - (AD)^\top [ (AD)(AD)^\top]^{-1} (AD)$$

The existence of $P$ requires that $(AD)(AD)^\top = (m\times n)(n\times m) = (m \times m)$ be invertible, and hence $\textrm{rank}\, AD = m$. This is true if each $x_i > 0$ and $\textrm{rank}\, A = m$.
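A quick numerical sanity check of this formula, with small random matrices assumed only for illustration: $P$ should satisfy $(AD)P = 0$ and $P^2 = P$.

# Verify numerically that P = I - B^T (B B^T)^{-1} B with B = A D(x) is the
# orthogonal projector onto the null space of B (random data for illustration)
from pylab import *
from scipy.linalg import solve, norm
from numpy.random import rand, seed

seed(0)
m, n = 3, 6
A = rand(m, n)
x = rand(n) + 0.1                      # strictly positive point
B = A @ diag(x)                        # B = A D(x)
P = eye(n) - B.T @ solve(B @ B.T, B)
print(norm(B @ P), norm(P @ P - P))    # both should be near zero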

Now (4) will be satisfied by the solution of the ODE if

$$\begin{aligned} 0 \le \dd{}{t} c^\top x(t) &= c^\top f(x) \\ &= c^\top D P(x) H(x) \\ &= (Dc)^\top P(x) H(x) \end{aligned}$$

Let us choose $H(x) = D(x) c$; then

$$\begin{aligned} (Dc)^\top P (Dc) &= v^\top P v, \qquad v = Dc \\ &= \ip{v, Pv} \\ &= \ip{v-Pv+Pv,\,Pv} \\ &= \ip{v-Pv,Pv} + \ip{Pv,Pv} \\ &= \ip{Pv,Pv} \\ &\ge 0 \end{aligned}$$

We have constructed an ODE model whose solution has all the required properties. How do we compute $Pv$? Using the definition of $P$ requires computing a matrix inverse, which can be expensive if the sizes involved are large. Instead, define $B = AD$ and, given $v \in \re^n$, compute $Pv$ by the following steps (a short sketch follows the list).

  1. Solve $B B^\top z = B v$ for $z$.

  2. Set $Pv = v - B^\top z$.
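A short sketch of these two steps, with random data assumed only for illustration; note that no matrix inverse is formed explicitly.

# Compute Pv by solving B B^T z = B v and setting Pv = v - B^T z
# (random B and v, assumed only for illustration)
from pylab import *
from scipy.linalg import solve, norm
from numpy.random import rand, seed

seed(1)
m, n = 3, 6
B = rand(m, n)                         # stands in for B = A D(x)
v = rand(n)

z  = solve(B @ B.T, B @ v)             # step 1: solve B B^T z = B v
Pv = v - B.T @ z                       # step 2: Pv = v - B^T z
print(norm(B @ Pv))                    # Pv lies in the null space of B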

Now we are in a position to solve the ODE

$$x'(t) = D(x) P(x) D(x) c, \qquad x(0) = x^0 \in \mathcal{F}$$

The simplest approach is the Euler method, which finds $x^{k+1} \approx x(t_k + \Delta t_k)$ from

$$\frac{x^{k+1} - x^k}{\Delta t_k} = f(x^k), \qquad k = 0,1,2,\ldots$$
$$x^{k+1} = x^k + \Delta t_k f(x^k)$$

It is easy to verify that $Ax^{k+1} = b$ since $Af(x^k) = 0$. It remains to ensure that $x^{k+1} > 0$, which may not be automatically satisfied by the Euler method. We can satisfy this condition by choosing $\Delta t_k$ small enough that

$$x_i^{k+1} = x_i^k + \Delta t_k f_i(x^k) \ge 0$$

e.g.,

$$\Delta t_k = \frac{9}{10} \min_{i \,:\, f_i(x^k) < 0} \left[ \frac{-x_i^k}{f_i(x^k)} \right]$$

The factor $9/10$ ensures the strict inequality $x^{k+1} > 0$.
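Putting the pieces together, here is a minimal sketch of this Euler iteration on a tiny toy problem; the data $A$, $b$, $c$ and the strictly feasible starting point are assumptions made only for this illustration (maximize $x_1 + 2x_2$ subject to $x_1 + x_2 + x_3 = 1$, $x \ge 0$, whose maximum value is 2).

# Euler iteration for x'(t) = D(x) P(x) D(x) c on a toy LP
# (A, b, c and the starting point are illustrative assumptions)
from pylab import *
from scipy.linalg import solve, norm

A = array([[1.0, 1.0, 1.0]])
b = array([1.0])
c = array([1.0, 2.0, 0.0])
x = array([1.0, 1.0, 1.0]) / 3.0       # strictly feasible: A x = b, x > 0

for k in range(30):
    D = diag(x)
    B = A @ D
    v = D @ c
    z = solve(B @ B.T, B @ v)          # B B^T z = B v
    f = D @ (v - B.T @ z)              # f(x) = D(x) P(x) D(x) c
    if norm(f) < 1e-12 or not (f < 0).any():
        break
    dt = 0.9 * (-x[f < 0] / f[f < 0]).min()   # keep x > 0
    x = x + dt * f
    print(k, x, c @ x)                 # objective increases towards 2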

References
  1. Kincaid, D., & Cheney, W. (2002). Numerical Analysis: Mathematics of Scientific Computing (3rd ed.). American Mathematical Society.