Convex Optimization - Differentiable Function



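Let $S$ be a non-empty open set in $\mathbb{R}^n$. A function $f: S \rightarrow \mathbb{R}$ is said to be differentiable at $\hat{x} \in S$ if there exist a vector $\nabla f(\hat{x})$, called the gradient vector, and a function $\alpha: \mathbb{R}^n \rightarrow \mathbb{R}$ such that

$f(x) = f(\hat{x}) + \nabla f(\hat{x})^T (x - \hat{x}) + \left \| x - \hat{x} \right \| \alpha(\hat{x}, x - \hat{x}), \quad \forall x \in S$

where $\alpha(\hat{x}, x - \hat{x}) \rightarrow 0$ as $x \rightarrow \hat{x}$, and

$\nabla f(\hat{x}) = \left [ \frac{\partial f}{\partial x_1} \; \frac{\partial f}{\partial x_2} \; \dots \; \frac{\partial f}{\partial x_n} \right ]^T_{x = \hat{x}}$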
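The definition can be checked numerically. The sketch below is only an illustration under stated assumptions: the function $f(x_1, x_2) = x_1^2 + 3 x_1 x_2$, the point $\hat{x} = (1, 2)$ and the direction are hypothetical choices, not taken from the text. It computes the remainder $\alpha$ as the linearization error divided by $\left \| x - \hat{x} \right \|$ and shows that it shrinks as $x$ approaches $\hat{x}$.

```python
import math

def f(x):
    # Hypothetical example function: f(x1, x2) = x1^2 + 3*x1*x2
    return x[0] ** 2 + 3.0 * x[0] * x[1]

def grad_f(x):
    # Analytic gradient of the example: [2*x1 + 3*x2, 3*x1]
    return [2.0 * x[0] + 3.0 * x[1], 3.0 * x[0]]

def remainder_alpha(x_hat, d):
    """alpha = (f(x_hat + d) - f(x_hat) - grad f(x_hat)^T d) / ||d||."""
    x = [a + b for a, b in zip(x_hat, d)]
    linear = sum(g * di for g, di in zip(grad_f(x_hat), d))
    norm = math.sqrt(sum(di * di for di in d))
    return (f(x) - f(x_hat) - linear) / norm

x_hat = [1.0, 2.0]
direction = [0.6, -0.8]  # a unit vector, chosen arbitrarily

# Evaluate alpha along x = x_hat + t * direction for shrinking t.
steps = (1.0, 0.1, 0.01, 0.001)
alphas = [remainder_alpha(x_hat, [t * c for c in direction]) for t in steps]

for t, a in zip(steps, alphas):
    print(f"t = {t:<6} alpha = {a:.6f}")
```

For this particular quadratic the remainder is proportional to the step length, so each tenfold reduction of the step reduces $\alpha$ tenfold.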

Theorem

Let $S$ be a non-empty open convex set in $\mathbb{R}^n$ and let $f: S \rightarrow \mathbb{R}$ be differentiable on $S$. Then $f$ is convex if and only if, for all $x_1, x_2 \in S$, $\nabla f(x_2)^T (x_1 - x_2) \leq f(x_1) - f(x_2)$.
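As a small worked illustration, take the one-dimensional function $f(x) = x^2$ on $S = \mathbb{R}$ (an example chosen here, not part of the theorem), for which $\nabla f(x) = 2x$. The gap between the two sides of the inequality can be computed directly:

```latex
\begin{align*}
f(x_1) - f(x_2) - \nabla f(x_2)^T (x_1 - x_2)
  &= x_1^2 - x_2^2 - 2 x_2 (x_1 - x_2) \\
  &= x_1^2 - 2 x_1 x_2 + x_2^2 \\
  &= (x_1 - x_2)^2 \;\geq\; 0
\end{align*}
```

The gap is non-negative for every pair of points, which matches the fact that $x^2$ is convex.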

Proof

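Let $f$ be a convex function, i.e., for $x_1, x_2 \in S$ and $\lambda \in (0, 1)$,

$f[\lambda x_1 + (1 - \lambda) x_2] \leq \lambda f(x_1) + (1 - \lambda) f(x_2)$

$\Rightarrow f[\lambda x_1 + (1 - \lambda) x_2] \leq \lambda (f(x_1) - f(x_2)) + f(x_2)$

$\Rightarrow \lambda (f(x_1) - f(x_2)) \geq f(x_2 + \lambda (x_1 - x_2)) - f(x_2)$

Since $f$ is differentiable at $x_2$, expanding $f(x_2 + \lambda (x_1 - x_2))$ gives

$\lambda (f(x_1) - f(x_2)) \geq f(x_2) + \lambda \nabla f(x_2)^T (x_1 - x_2) + \left \| \lambda (x_1 - x_2) \right \| \alpha(x_2, \lambda (x_1 - x_2)) - f(x_2)$

where $\alpha(x_2, \lambda (x_1 - x_2)) \rightarrow 0$ as $\lambda \rightarrow 0$.

Dividing both sides by $\lambda$ and letting $\lambda \rightarrow 0^+$, we get:

$f(x_1) - f(x_2) \geq \nabla f(x_2)^T (x_1 - x_2)$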

Converse

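Suppose that for all $x_1, x_2 \in S$, $\nabla f(x_2)^T (x_1 - x_2) \leq f(x_1) - f(x_2)$.

We show that $f$ is convex.

Since $S$ is convex, $x_3 = \lambda x_1 + (1 - \lambda) x_2 \in S$ for $\lambda \in (0, 1)$.

Since $x_1, x_3 \in S$,

$f(x_1) - f(x_3) \geq \nabla f(x_3)^T (x_1 - x_3)$

$\Rightarrow f(x_1) - f(x_3) \geq \nabla f(x_3)^T (x_1 - \lambda x_1 - (1 - \lambda) x_2)$

$\Rightarrow f(x_1) - f(x_3) \geq (1 - \lambda) \nabla f(x_3)^T (x_1 - x_2)$

Since $x_2, x_3 \in S$,

$f(x_2) - f(x_3) \geq \nabla f(x_3)^T (x_2 - x_3)$

$\Rightarrow f(x_2) - f(x_3) \geq \nabla f(x_3)^T (x_2 - \lambda x_1 - (1 - \lambda) x_2)$

$\Rightarrow f(x_2) - f(x_3) \geq (-\lambda) \nabla f(x_3)^T (x_1 - x_2)$

Multiplying the first inequality by $\lambda$ and the second by $(1 - \lambda)$ and adding, the right-hand sides cancel and we get:

$\lambda (f(x_1) - f(x_3)) + (1 - \lambda)(f(x_2) - f(x_3)) \geq 0$

$\Rightarrow f(x_3) \leq \lambda f(x_1) + (1 - \lambda) f(x_2)$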
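The first-order condition $\nabla f(x_2)^T (x_1 - x_2) \leq f(x_1) - f(x_2)$ from the theorem above can also be probed numerically. The following is a minimal sketch, not a proof: the two test functions, the sampling box $[-5, 5]^2$, and the helper names `first_order_holds` and `count_violations` are all hypothetical choices made for illustration. Random sampling can expose a violation but cannot certify convexity.

```python
import random

def f_convex(x):
    # Convex example: f(x) = x1^2 + x2^2
    return x[0] ** 2 + x[1] ** 2

def grad_convex(x):
    return [2.0 * x[0], 2.0 * x[1]]

def f_nonconvex(x):
    # Non-convex example: f(x) = x1^2 - x2^2 (a saddle)
    return x[0] ** 2 - x[1] ** 2

def grad_nonconvex(x):
    return [2.0 * x[0], -2.0 * x[1]]

def first_order_holds(f, grad, x1, x2, tol=1e-9):
    """True if grad f(x2)^T (x1 - x2) <= f(x1) - f(x2), up to tol."""
    lhs = sum(g * (a - b) for g, a, b in zip(grad(x2), x1, x2))
    return lhs <= f(x1) - f(x2) + tol

def count_violations(f, grad, trials=1000, seed=0):
    """Count sampled pairs (x1, x2) that violate the inequality."""
    rng = random.Random(seed)
    violations = 0
    for _ in range(trials):
        x1 = [rng.uniform(-5, 5), rng.uniform(-5, 5)]
        x2 = [rng.uniform(-5, 5), rng.uniform(-5, 5)]
        if not first_order_holds(f, grad, x1, x2):
            violations += 1
    return violations

convex_violations = count_violations(f_convex, grad_convex)
nonconvex_violations = count_violations(f_nonconvex, grad_nonconvex)
print("violations (convex example):    ", convex_violations)
print("violations (non-convex example):", nonconvex_violations)
```

For the convex example no sampled pair violates the inequality. For the saddle, a single pair such as $x_1 = (0, 1)$, $x_2 = (0, 0)$ is already a counterexample, since the left side is $0$ and the right side is $-1$.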

Theorem

Let $S$ be a non-empty open convex set in $\mathbb{R}^n$ and let $f: S \rightarrow \mathbb{R}$ be differentiable on $S$. Then $f$ is convex on $S$ if and only if, for any $x_1, x_2 \in S$, $(\nabla f(x_2) - \nabla f(x_1))^T (x_2 - x_1) \geq 0$.

Proof

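Let $f$ be a convex function. Then, using the previous theorem,

$\nabla f(x_2)^T (x_1 - x_2) \leq f(x_1) - f(x_2)$ and

$\nabla f(x_1)^T (x_2 - x_1) \leq f(x_2) - f(x_1)$

Adding the above two inequalities, we get:

$\nabla f(x_2)^T (x_1 - x_2) + \nabla f(x_1)^T (x_2 - x_1) \leq 0$

$\Rightarrow (\nabla f(x_2) - \nabla f(x_1))^T (x_1 - x_2) \leq 0$

$\Rightarrow (\nabla f(x_2) - \nabla f(x_1))^T (x_2 - x_1) \geq 0$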

Converse

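Suppose that for any $x_1, x_2 \in S$, $(\nabla f(x_2) - \nabla f(x_1))^T (x_2 - x_1) \geq 0$.

We show that $f$ is convex.

Let $x_1, x_2 \in S$. By the mean value theorem, there exists a point $x = \lambda x_1 + (1 - \lambda) x_2$ with $\lambda \in (0, 1)$, which lies in $S$ because $S$ is a convex set, such that

$f(x_1) - f(x_2) = \nabla f(x)^T (x_1 - x_2)$

Applying the assumption to $x$ and $x_1$, we know:

$(\nabla f(x) - \nabla f(x_1))^T (x - x_1) \geq 0$

$\Rightarrow (\nabla f(x) - \nabla f(x_1))^T (\lambda x_1 + (1 - \lambda) x_2 - x_1) \geq 0$

$\Rightarrow (1 - \lambda)(\nabla f(x) - \nabla f(x_1))^T (x_2 - x_1) \geq 0$

Since $1 - \lambda > 0$, this gives

$\nabla f(x)^T (x_2 - x_1) \geq \nabla f(x_1)^T (x_2 - x_1)$

Combining this with $f(x_2) - f(x_1) = \nabla f(x)^T (x_2 - x_1)$, we get:

$\nabla f(x_1)^T (x_2 - x_1) \leq f(x_2) - f(x_1)$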

Hence using the last theorem, f is a convex function.
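The gradient condition $(\nabla f(x_2) - \nabla f(x_1))^T (x_2 - x_1) \geq 0$ can be explored with a short script. This is a hedged sketch: the example functions $e^{x_1} + x_2^2$ (convex) and $\sin(x_1) + x_2^2$ (not convex), the sampling box $[-3, 3]^2$, and the helper names `monotonicity_gap` and `min_gap` are assumptions made for illustration only.

```python
import math
import random

def grad_convex(x):
    # Gradient of the convex example f(x) = exp(x1) + x2^2
    return [math.exp(x[0]), 2.0 * x[1]]

def grad_nonconvex(x):
    # Gradient of the non-convex example f(x) = sin(x1) + x2^2
    return [math.cos(x[0]), 2.0 * x[1]]

def monotonicity_gap(grad, x1, x2):
    """Return (grad f(x2) - grad f(x1))^T (x2 - x1)."""
    g1, g2 = grad(x1), grad(x2)
    return sum((b - a) * (q - p) for a, b, p, q in zip(g1, g2, x1, x2))

def min_gap(grad, trials=1000, seed=0):
    """Smallest gap seen over randomly sampled pairs in [-3, 3]^2."""
    rng = random.Random(seed)
    smallest = float("inf")
    for _ in range(trials):
        x1 = [rng.uniform(-3, 3), rng.uniform(-3, 3)]
        x2 = [rng.uniform(-3, 3), rng.uniform(-3, 3)]
        smallest = min(smallest, monotonicity_gap(grad, x1, x2))
    return smallest

convex_min = min_gap(grad_convex)
nonconvex_min = min_gap(grad_nonconvex)
print("smallest gap (convex example):    ", convex_min)
print("smallest gap (non-convex example):", nonconvex_min)

# A deterministic counterexample for the non-convex example:
# x1 = (0, 0), x2 = (pi, 0) gives (cos(pi) - cos(0)) * pi = -2*pi < 0.
counterexample_gap = monotonicity_gap(grad_nonconvex, [0.0, 0.0], [math.pi, 0.0])
```

A non-negative smallest gap over the samples is consistent with convexity but does not prove it, while a single negative gap, like the counterexample at the end, rules convexity out.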

Twice Differentiable Function

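Let $S$ be a non-empty subset of $\mathbb{R}^n$ and let $f: S \rightarrow \mathbb{R}$. Then $f$ is said to be twice differentiable at $\bar{x} \in S$ if there exist a vector $\nabla f(\bar{x})$, an $n \times n$ matrix $H(\bar{x})$ called the Hessian matrix, and a function $\alpha: \mathbb{R}^n \rightarrow \mathbb{R}$ such that

$f(x) = f(\bar{x} + x - \bar{x}) = f(\bar{x}) + \nabla f(\bar{x})^T (x - \bar{x}) + \frac{1}{2} (x - \bar{x})^T H(\bar{x}) (x - \bar{x}) + \left \| x - \bar{x} \right \|^2 \alpha(\bar{x}, x - \bar{x})$

where $\alpha(\bar{x}, x - \bar{x}) \rightarrow 0$ as $x \rightarrow \bar{x}$.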
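The second-order expansion can be checked numerically in the same spirit. In the sketch below, $\alpha$ is taken to be the error of the quadratic approximation divided by $\left \| x - \bar{x} \right \|^2$, which is the usual convention for this definition. The function $f(x_1, x_2) = e^{x_1} + x_1 x_2^2$, the point $\bar{x} = (0, 1)$ and the direction are hypothetical choices for illustration.

```python
import math

def f(x):
    # Hypothetical example: f(x1, x2) = exp(x1) + x1 * x2^2
    return math.exp(x[0]) + x[0] * x[1] ** 2

def grad_f(x):
    # Analytic gradient: [exp(x1) + x2^2, 2*x1*x2]
    return [math.exp(x[0]) + x[1] ** 2, 2.0 * x[0] * x[1]]

def hessian_f(x):
    # Analytic Hessian: [[exp(x1), 2*x2], [2*x2, 2*x1]]
    return [[math.exp(x[0]), 2.0 * x[1]],
            [2.0 * x[1], 2.0 * x[0]]]

def second_order_alpha(x_bar, d):
    """alpha = (f(x_bar + d) - quadratic model at x_bar) / ||d||^2."""
    x = [a + b for a, b in zip(x_bar, d)]
    g = grad_f(x_bar)
    h = hessian_f(x_bar)
    linear = sum(gi * di for gi, di in zip(g, d))
    quadratic = 0.5 * sum(d[i] * h[i][j] * d[j]
                          for i in range(2) for j in range(2))
    norm_sq = sum(di * di for di in d)
    return (f(x) - f(x_bar) - linear - quadratic) / norm_sq

x_bar = [0.0, 1.0]
direction = [0.6, 0.8]  # a unit vector, chosen arbitrarily

# Evaluate alpha along x = x_bar + t * direction for shrinking t.
steps = (1.0, 0.1, 0.01, 0.001)
alphas = [second_order_alpha(x_bar, [t * c for c in direction]) for t in steps]

for t, a in zip(steps, alphas):
    print(f"t = {t:<6} alpha = {a:.6f}")
```

The printed values shrink roughly in proportion to the step length, as expected when the leading neglected term is of third order.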
