Skip to article frontmatterSkip to article content

Homotopy and continuation methods

from pylab import *
from scipy.integrate import odeint
from scipy.linalg import solve, norm

The material of this Chapter is from Kincaid & Cheney, 2002.

Let f:XYf : X \to Y and consider the problem: find xXx \in X such that f(x)=0f(x) = 0. This problem may be difficult to solve. For Newton method, we need a good initial guess, which we may not know.

1An ODE

The solution of h(t,x)=0h(t,x)=0 depends on the parameter tt, so x(t)x(t) satisfies

h(t,x(t))=0h(t, x(t)) = 0

Differentiate wrt tt

ht(t,x(t))+hx(t,x(t))x(t)=0h_t (t, x(t)) + h_x(t,x(t)) x'(t) = 0

so that

x(t)=[hx(t,x(t))]1ht(t,x(t))x'(t) = - [h_x(t,x(t))]^{-1} h_t(t,x(t))

For this ODE to be valid, we need hh to be differentiable wrt tt and xx and hxh_x must be non-singular. Suppose we know that x0x_0 solves h(0,x)=0h(0,x)=0. We solve the ODE upto t=1t=1 with initial condition x(0)=x0x(0) = x_0.

2Relation to Newton method

Consider the homotopy

h(t,x)=f(x)etf(x0)h(t,x) = f(x) - \ee^{-t} f(x_0)

so that

h(0,x)=f(x)f(x0),h(,x)=f(x)h(0,x) = f(x) - f(x_0), \qquad h(\infty,x) = f(x)

The curve x(t)x(t) such that

h(t,x(t))=0=f(x(t))etf(x0)h(t,x(t)) = 0 = f(x(t)) - \ee^{-t} f(x_0)

satisfies the ODE

0=f(x(t))x(t)+etf(x0)=f(x(t))x(t)+f(x(t))0 = f'(x(t)) x'(t) + \ee^{-t} f(x_0) = f'(x(t)) x'(t) + f(x(t))

so that

x(t)=[f(x(t))]1f(x(t))x'(t) = -[f'(x(t))]^{-1} f(x(t))

Applying forward Euler scheme with Δt=1\Delta t = 1 we get

xn+1=xn[f(xn)]1f(xn)x_{n+1} = x_n - [f'(x_n)]^{-1} f(x_n)

we get the Newton-Raphson method.

3Linear programming

Let c,xRnc,x \in \re^n, ARm×nA \in \re^{m \times n}, m<nm < n and consider the constrained optimization problem

{maxxRncxsubject to Ax=b,x0\begin{cases} \max\limits_{x \in \re^n} c^\top x \\ \textrm{subject to } A x = b, \qquad x \ge 0 \end{cases}

Choose a starting point x0x^0 which satisfies the constraints, i.e.

x0F={xRn:Ax=b,x0}x^0 \in \mathcal{F} = \{ x \in \re^n : A x = b, \quad x \ge 0 \}

Our goal is to find x1x^1 such that cx1>cx0c^\top x^1 > c^\top x^0 and x1Fx^1 \in \mathcal{F}. Equivalently, find a curve x(t)x(t) such that

  1. x(0)=x0x(0) = x^0

  2. x(t)0x(t) \ge 0 t0\forall t \ge 0

  3. Ax(t)=bA x(t) = b

  4. cx(t)c^\top x(t) is increasing function of tt, t0t \ge 0

Let us try to find an equation satisfied by such a curve x(t)x(t)

x(t)=f(x),x(0)=x0x'(t) = f(x), \qquad x(0) = x^0

To satisfy (2), choose ff such that fi(x)0f_i(x) \to 0 as xi0x_i \to 0. One such choice is

D(x)=diag[x1,x2,,xn]D(x) = \diag{[x_1, x_2, \ldots, x_n]}
f(x)=D(x)G(x),xi=xiGi(x)f(x) = D(x) G(x), \qquad x_i' = x_i G_i(x)

To satisfy (3) we must have Ax(t)=0A x'(t) = 0 or AD(x)G(x)=0A D(x) G(x) = 0. Choose

G(x)=P(x)H(x)G(x) = P(x) H(x)

where P:RnP : \re^n \to null space of AD(x)AD(x). So we can take

P(x)=orthogonal projection of x into the null space of AD(x)P(x) = \textrm{orthogonal projection of $x$ into the null space of $AD(x)$}

The null space of AD(x)AD(x) is a subspace of Rn\re^n. By Projection Theorem in Hilbert spaces, we have

vRn,Pvnull space of AD(x)\forall v \in \re^n, \qquad Pv \in \textrm{null space of } AD(x)
 iff (vPv,w)=0wnull space of AD(x)\textrm{ iff } \ip{v-Pv,w} = 0 \quad \forall w \in \textrm{null space of } AD(x)

In particular

(vPv,Pv)=0,vRn\ip{v - Pv, Pv} = 0, \qquad \forall v \in \re^n

Such a PP is given by

P=I(AD)[(AD)(AD)]1(AD)P = I - (AD)^\top [ (AD)(AD)^\top]^{-1} (AD)

The existence of PP requires that (AD)(AD)=(m×n)(n×m)=(m×m)(AD)(AD)^\top = (m\times n)(n\times m) = (m \times m) is invertible, and hence rank AD=mAD = m. This is true if each xi>0x_i > 0 and rank A=mA=m.

Now (4) will be satisfied by the solution of the ODE if

0ddtcx(t)=cf(x)=cDP(x)H(x)=(Dc)P(x)H(x)\begin{aligned} 0 \le& \dd{}{t} c^\top x(t) \\ =& c^\top f(x) \\ =& c^\top D P(x) H(x) \\ =& (Dc)^\top P(x) H(x) \end{aligned}

Let us choose H(x)=D(x)cH(x) = D(x) c, then

(Dc)P(Dc)=vPv,v=Dc=(v,Pv)=(vPv+Pv,Pv)=(vPv,Pv)+(Pv,Pv)=(Pv,Pv)0\begin{aligned} (Dc)^\top P (Dc) =& v^\top P v, \qquad v = Dc \\ =& \ip{v, Pv} \\ =& \ip{v-Pv+Pv,Pv} \\ =& \ip{v-Pv,Pv} + \ip{Pv,Pv} \\ =& \ip{Pv,Pv} \\ \ge& 0 \end{aligned}

We have constructed an ODE model whose solution has all the required properties. How do we compute PvPv ? Using the definition of PP requires computing an inverse matrix which can be expensive if the sizes involved are large. Instead, define B=ADB=AD and given vRnv \in \re^n, compute PvPv by the following steps.

  1. Solve for zz

    BBz=Bv B B^\top z = Bv
  2. Pv=vBzPv = v - B^\top z

Now we are in a position to solve the ODE

x(t)=D(x)P(x)D(x)c,x(0)=x0Fx'(t) = D(x) P(x) D(x) c, \qquad x(0) = x^0 \in \mathcal{F}

The simplest approach is the Euler method which finds xk+1x(tk+Δtk)x^{k+1} \approx x(t_k+\Delta t_k) from

xk+1xkΔtk=f(xk),k=0,1,2,\frac{x^{k+1} - x^k}{\Delta t_k} = f(x^k), \qquad k=0,1,2,\ldots
xk+1=xk+Δtkf(xk)x^{k+1} = x^k + \Delta t_k f(x^k)

It is easy to verify that Axk+1=bAx^{k+1}=b since Af(xk)=0Af(x^k) = 0. It remains to ensure that xk+1>0x^{k+1} > 0 which may not be automatically satisfied by the Euler method. We can satisfy this condition by choosing Δtk\Delta t_k small enough,

xik+1=xik+Δtkfi(xk)0x_i^{k+1} = x_i^k + \Delta t_k f_i(x^k) \ge 0

e.g.,

Δtk=910mini[xikfi(xk)]\Delta t_k = \frac{9}{10} \min_{i} \left[ \frac{-x_i^k}{f_i(x^k)} \right]

The factor 9/109/10 ensures the strict inequality xk+1>0x^{k+1} > 0.

References
  1. Kincaid, D., & Cheney, W. (2002). Numerical Analysis: Mathematics of Scientific Computing (3rd ed.). American Mathematics Society.