Lecture notes on optimization
Chapter 1. Convex sets and convex functions taking the infinity value
Chapter 2. Topological properties for sets and functions
Chapter 3. Duality for sets and functions
Chapter 4. Subdifferential calculus for convex functions
Chapter 5. Duality in convex optimization
Relative Interior of a Convex Set
Definition. Let C be a nonempty convex subset of IRn. The relative
interior of C is the largest open set for the topology induced on aff C
that is contained in C . This set is denoted ri C .
We have that the relative interior ri C of C is the interior of C for the
topology relative to the affine hull of C
x ∈ ri C ⇔ x ∈ aff C and ∃δ > 0 such that B(x , δ) ∩ aff C ⊆ C
Furthermore, if int C ≠ ∅, then aff C = IRn and ri C = int C
Proposition. Let C be a nonempty convex subset of IRn. Then the relative
interior ri C is nonempty.
tvnguyen (University of Science) Convex Optimization 34 / 108
Chapter 2 Topological properties for sets and functions
Examples
1. Let C = {x}. Then aff C = {x}, int C = ∅ and ri C = {x}.
2. Let C = [a, b] where a, b ∈ IRn with a ≠ b and n ≥ 2. Then aff C is
the straight line through a and b, int C = ∅ and ri C = ]a, b[.
3. Let C = {x ∈ IRn | x1 + · · ·+ xn = 1, xi ≥ 0, i = 1, . . . , n}. Then
int C = ∅ and
aff C = {x ∈ IRn | x1 + · · ·+ xn = 1}
ri C = {x ∈ IRn | x1 + · · ·+ xn = 1, xi > 0, i = 1, . . . , n}
For example, in IR3, let C be the triangle with vertices (1, 0, 0), (0, 1, 0),
and (0, 0, 1). Then int C = ∅, while the relative interior of the triangle is
the triangle without its edges (its interior relative to the plane aff C ).
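The membership tests in Example 3 are easy to check numerically. A minimal sketch for the simplex in IR3 (the helper names in_aff and in_ri are ours, not from the slides):

```python
# Characterization from Example 3: a point of the unit simplex in IR^3
# lies in ri C iff its coordinates sum to 1 and are all strictly positive.

def in_aff(x, tol=1e-12):
    # membership in aff C = {x | x1 + x2 + x3 = 1}
    return abs(sum(x) - 1.0) < tol

def in_ri(x, tol=1e-12):
    # membership in ri C: on aff C with all coordinates positive
    return in_aff(x, tol) and all(xi > 0 for xi in x)

barycenter = (1/3, 1/3, 1/3)   # a point inside the triangle
vertex = (1.0, 0.0, 0.0)       # a vertex: in C but not in ri C

assert in_ri(barycenter)
assert in_aff(vertex) and not in_ri(vertex)
```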
Continuity and locally Lipschitz continuity
Definition. Let f : IRn → IR ∪ {+∞} and x ∈ ri dom f . The function
f is continuous at x if for all ε > 0 there exists δ > 0 such that
∀y ∈ B(x , δ) ∩ aff domf |f (y)− f (x)| < ε.
The function f is locally Lipschitz continuous at x ∈ ri domf if there
exists δ > 0 and L > 0 such that
∀y , z ∈ B(x , δ) ∩ aff domf |f (y)− f (z)| ≤ L‖y − z‖
Proposition. Let f : IRn → IR ∪ {+∞} be proper convex. Then
f is locally Lipschitz continuous on ri dom f
In particular f is continuous on ri dom f .
Corollary. Let f : IRn → IR be convex. Then f is continuous and
locally Lipschitz continuous on IRn.
Example. Any norm on IRn is convex and continuous on IRn
Lower semi continuity
Definition. Let f : IRn → IR ∪ {+∞} be proper. f is said to be lower
semi continuous (l.s.c) at x ∈ IRn if
lim inf y→x f (y) ≥ f (x)   (⇔ ∀ {yk} → x , lim inf k→∞ f (yk) ≥ f (x))
Proposition. Let f : IRn → IR ∪ {+∞}. The following three
properties are equivalent :
(i) f is l.s.c on IRn
(ii) epi f is a closed subset of IRn × IR
(iii) the sublevel sets Sr (f ) = {x ∈ IRn | f (x) ≤ r} are closed (possibly
empty) for all r ∈ IR
A function satisfying one of these three properties is also called a closed
function.
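The lim inf characterization can be illustrated on a step function; a small sketch (function names are ours), comparing a lower semicontinuous choice of the jump value with a non-l.s.c. one:

```python
def f_lsc(x):
    # l.s.c. step: the value at the jump point 0 is the lower one
    return 0.0 if x <= 0 else 1.0

def f_not_lsc(x):
    # same step but taking the upper value at 0: not l.s.c. there
    return 1.0 if x <= 0 else 0.0

def liminf_at(f, x, n=2000):
    # crude lim inf of f(y_k) along y_k -> x approaching from both sides
    tail = [f(x + (-1)**k / (k + 2)) for k in range(n // 2, n)]
    return min(tail)

# l.s.c. at 0: lim inf >= f(0); the other choice violates the inequality
assert liminf_at(f_lsc, 0.0) >= f_lsc(0.0)
assert liminf_at(f_not_lsc, 0.0) < f_not_lsc(0.0)
```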
[Figure: illustration of lower semicontinuity]
Closure or l.s.c hull of a convex function
Definition. The closure (or l.s.c hull) of a convex function f is the
function cl f : IRn → IR ∪ {+∞} defined by
epi cl f = cl epi f
Proposition. The closure of a convex function f : IRn → IR ∪ {+∞}
is the supremum of all affine functions minorizing f :
cl f (x) = sup (s,b)∈IRn×IR {〈s, x〉 − b : 〈s, y〉 − b ≤ f (y) for all y ∈ IRn}
Proposition. Let f : IRn → IR ∪ {+∞} be proper convex. Then
(i) cl f : IRn → IR ∪ {+∞} is proper convex,
(ii) cl f and f coincide on ri dom f .
Examples
Given a nonempty subset C ⊂ IRn, the indicator function δC is defined
by
δC (x) = 0 if x ∈ C , and +∞ otherwise
Clearly δC is [closed and] convex if and only if C is [closed and]
convex. Indeed epi δC = C × IR+
Let (s1, b1), . . . , (sm, bm) be m elements of IRn × IR. Then the
function f (x) = max {〈sj , x〉 − bj | j = 1, . . . ,m} is called piecewise
linear. It is a closed and convex function.
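Convexity of such a piecewise linear max of affine functions can be checked along a segment; a small sketch with illustrative data (the coefficients S and B below are an arbitrary choice of ours):

```python
def f(x):
    # f(x) = max_j <s_j, x> - b_j with sample data in IR^2
    S = [(1.0, 0.0), (0.0, 1.0), (-1.0, -1.0)]
    B = [0.0, 1.0, -2.0]
    return max(s1 * x[0] + s2 * x[1] - b for (s1, s2), b in zip(S, B))

def convex_on_segment(g, x, y, steps=11):
    # check g(t x + (1-t) y) <= t g(x) + (1-t) g(y) on sample points
    gx, gy = g(x), g(y)
    for i in range(steps):
        t = i / (steps - 1)
        z = tuple(t * a + (1 - t) * b for a, b in zip(x, y))
        if g(z) > t * gx + (1 - t) * gy + 1e-9:
            return False
    return True

assert convex_on_segment(f, (2.0, -1.0), (-3.0, 4.0))
```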
A polyhedral function f is a function whose epigraph is a closed
convex polyhedron :
epi f = {(x , r) ∈ IRn × IR | 〈sj , x〉+ αj r ≤ bj for j ∈ J}
where J is finite, the (sj , αj , bj) being given in IRn × IR× IR with
(sj , αj) ≠ 0. Such a function is closed and convex.
Asymptotic cone of a convex set
Definition. The asymptotic cone (or recession cone) of the closed
convex set C is the closed convex cone defined for all x ∈ C by
C∞(x) = {d ∈ IRn : x + t d ∈ C for all t > 0}
Proposition. The closed convex cone C∞(x) does not depend on the choice of x ∈ C ; it is therefore denoted C∞.
Proposition. A closed convex set C is compact if and only if
C∞ = {0}
[Figure: illustration of asymptotic cones]
Asymptotic function of a convex function
Definition. Let f : IRn → IR ∪ {+∞} be proper l.s.c. convex. The
asymptotic cone of epi f is the epigraph of the function f ′∞ defined by
d 7→ f ′∞(d) := sup t>0 [f (x0 + td)− f (x0)]/t = lim t→+∞ [f (x0 + td)− f (x0)]/t
where x0 is arbitrary in domf .
f ′∞ is proper l.s.c. convex and called the asymptotic function (or
recession function) of f .
Example. The asymptotic function of the indicator function is (δC )′∞ = δC∞
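For f = | · | and x0 = 0 the difference quotient is constant in t, so the sup over t > 0 and the limit as t → +∞ coincide at |d|; a quick numerical sketch (helper names are ours):

```python
def f(x):
    return abs(x)

def asymptotic_quotient(d, t, x0=0.0):
    # the quotient (f(x0 + t d) - f(x0)) / t from the definition of f'_inf
    return (f(x0 + t * d) - f(x0)) / t

# for f = |.| and x0 = 0 the quotient equals |d| for every t > 0,
# hence f'_inf(d) = |d|
for d in (-2.0, 0.5, 3.0):
    for t in (1.0, 10.0, 1e6):
        assert abs(asymptotic_quotient(d, t) - abs(d)) < 1e-9
```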
Property
Proposition. Let f : IRn → IR ∪ {+∞} be proper l.s.c. convex. The
following statements are equivalent :
(i) There exists r for which the sublevel set Sr (f ) is nonempty and compact
(ii) All the sublevel sets of f are compact
(iii) f ′∞(d) > 0 for all nonzero d ∈ IRn
The sublevel set of f at level r is Sr (f ) = {x ∈ IRn | f (x) ≤ r}
Chapter 3.
Duality for sets and functions
Dual representation of convex sets
Several basic geometrical objects in IRn can be described by using linear
forms. For example
A closed hyperplane H can be written
H = {x ∈ IRn | 〈p, x〉 = α}
for some p ∈ IRn, p ≠ 0, and α ∈ IR.
Similarly, a closed half-space H can be written
H = {x ∈ IRn | 〈p, x〉 ≤ α}
We will show that arbitrary closed convex sets in IRn can be described by
using only linear forms. This is what we call a dual representation.
This theory is based on the Hahn-Banach theorem.
Closest point theorem
A well-known geometric fact is that, given a closed convex set A and a
point x ∉ A, there exists a unique point y ∈ A with minimum distance
from x .
Theorem. (Closest Point Theorem) Let A be a nonempty, closed
convex set in IRn and x ∉ A. Then, there exists a unique point y ∈ A
with minimum distance from x . Furthermore, y is the minimizing point,
or closest point to x , if and only if 〈x − y , z − y〉 ≤ 0 for all z ∈ A.
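The characterization 〈x − y , z − y〉 ≤ 0 can be verified for projection onto a box, where the closest point is obtained by coordinatewise clipping; a minimal sketch (function names are ours):

```python
def project_box(x, lo=0.0, hi=1.0):
    # closest point to x in the box A = [lo, hi]^n (coordinatewise clipping)
    return tuple(min(max(xi, lo), hi) for xi in x)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

x = (2.0, -0.5, 0.3)
y = project_box(x)            # closest point to x in A = [0,1]^3
assert y == (1.0, 0.0, 0.3)

# variational inequality <x - y, z - y> <= 0, checked on a few z in A
for z in ((0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (0.5, 0.2, 0.9)):
    diff_xy = tuple(a - b for a, b in zip(x, y))
    diff_zy = tuple(a - b for a, b in zip(z, y))
    assert dot(diff_xy, diff_zy) <= 1e-12
```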
[Figure: closest point y ∈ A to x, with a generic z ∈ A]
Separation of convex sets
Almost all optimality conditions and duality relationships use some sort of
separation or support of convex sets.
Definition. (Separation of Sets) Let S1 and S2 be nonempty sets in
IRn. A hyperplane H = {x |〈p, x〉 = α} separates S1 and S2 if 〈p, x〉 ≥ α
for each x ∈ S1 and 〈p, x〉 ≤ α for each x ∈ S2.
If, in addition, 〈p, x〉 ≥ α + ε for each x ∈ S1 and 〈p, x〉 ≤ α for each
x ∈ S2, where ε is a positive scalar, then the hyperplane H is said to
strongly separate the sets S1 and S2.
Notice that strong separation implies separation of sets.
[Figure: separation and strong separation of S1 and S2 by a hyperplane H]
The following is the most fundamental separation theorem.
Theorem. (Separation Theorem) Let A be a nonempty closed convex
set in IRn and x ∉ A. Then, there exists a nonzero vector p and a scalar
α such that 〈p, x〉 > α and 〈p, z〉 ≤ α for each z ∈ A.
[Figure: separating x from A, where y is the closest point in A and p = x − y]
A direct consequence
Proposition. Let A be a nonempty closed convex set in IRn. Then A is
equal to the intersection of all closed half-spaces that contain it :
A = ∩{Hp≤α |A ⊂ Hp≤α }
where Hp≤α = {x ∈ IRn | 〈p, x〉 ≤ α}
In such a representation of a closed convex set, it is natural to look for the
simplest representation.
Observe :
(a) α′ ≥ α and A ⊂ Hp≤α ⇒ A ⊂ Hp≤α′
(b) fixing p ≠ 0 and making α vary gives rise to parallel hyperplanes
Support function
Question. For a given p ∈ IRn, p ≠ 0, such that A ⊂ Hp≤α for some
α ∈ IR, what is the intersection of all the parallel half-spaces containing A ?
Proposition. For a given p ∈ IRn, p ≠ 0, such that A ⊂ Hp≤α for some
α ∈ IR, the intersection of all the parallel half-spaces containing A is the
closed half-space Hp≤σA(p), where
σA(p) = sup {〈p, x〉 | x ∈ A}.
Definition. Let A be a nonempty subset of IRn. The support function
of A is defined by
σA : IRn → IR ∪ {+∞},   σA(p) = sup x∈A 〈p, x〉
Support function
As a direct consequence of the definition we obtain the dual representation
of a closed convex set.
Proposition. Let A be a closed convex set in IRn. Then A is
completely determined by its support function, i.e.,
A = {x | 〈p, x〉 ≤ σA(p), ∀p ∈ IRn}
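The dual representation can be sampled numerically for the box A = [−1, 1]^n, whose support function is σA(p) = |p1| + · · ·+ |pn|; a minimal sketch (helper names are ours, and only finitely many directions p are tested):

```python
def sigma_box(p):
    # support function of A = [-1,1]^n : sigma_A(p) = sum_i |p_i|
    return sum(abs(pi) for pi in p)

def in_A_dual(x, directions):
    # dual test: x in A iff <p,x> <= sigma_A(p) for all p (sampled here)
    return all(sum(pi * xi for pi, xi in zip(p, x)) <= sigma_box(p) + 1e-12
               for p in directions)

dirs = [(1, 0), (0, 1), (-1, 0), (0, -1), (1, 1), (-2, 3)]
assert sigma_box((1, 2)) == 3
assert in_A_dual((0.5, -0.9), dirs)      # inside the box
assert not in_A_dual((1.5, 0.0), dirs)   # violates <p,x> <= sigma_A(p) for p=(1,0)
```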
Here are some properties of the support function
Proposition. The support function σA : IRn → IR ∪ {+∞} of a closed
convex nonempty subset A is proper, closed, convex, and positively
homogeneous of degree 1. Its epigraph is a closed convex cone in
IRn × IR. Furthermore, dom σA = IRn when A is bounded.
[Figure: support function of A for two directions p]
Supporting hyperplane.
To have a sharper view of the dual representation of closed convex sets, it is
interesting to introduce the notion of supporting hyperplane. This is
related to the question :
Question : In the definition of σA(p) = sup x∈A 〈p, x〉, is the supremum
attained ?
Definition. Let p ∈ IRn, p ≠ 0. The hyperplane
H = {x ∈ IRn | 〈p, x〉 = σA(p)}
is a supporting hyperplane of A at x∗ ∈ A if
the closed half-space Hp≤σA(p) contains A and
H = {x ∈ IRn | 〈p, x〉 = σA(p)} intersects A at x∗
Note that the intersection of H with A may contain some other points
[Figure: (a) several supporting hyperplanes to A ⊂ IR2; (b) two supporting hyperplanes to A ⊂ IR3 with int A = ∅]
Supporting hyperplane. An equivalent definition
Definition. (Supporting Hyperplane at a Boundary Point) Let S be a
nonempty set in IRn, and let x̄ ∈ ∂S (the boundary of S).
A hyperplane H = {x | 〈p, x − x̄〉 = 0} is called a supporting hyperplane
of S at x̄ if either 〈p, x − x̄〉 ≥ 0 for each x ∈ S , or else
〈p, x − x̄〉 ≤ 0 for each x ∈ S .
[Figure: supporting hyperplane H of S at the boundary point x̄, with normal p]
Supporting hyperplane
A convex set has a supporting hyperplane at each boundary point.
Theorem. (Supporting Hyperplane) Let S be a nonempty convex set
in IRn, and let x̄ ∈ ∂S . Then there exists a hyperplane that supports S
at x̄ ; that is, there exists a nonzero vector p such that 〈p, x − x̄〉 ≤ 0
for each x ∈ cl S .
As a corollary, we have a result similar to the Separation Theorem, where
S is not required to be closed.
Corollary. Let S be a nonempty convex set in IRn and x̄ ∉ int S . Then
there is a nonzero vector p such that 〈p, x − x̄〉 ≤ 0 for each x ∈ cl S .
Separation of Disjoint Convex Sets
If two convex sets are disjoint, then they can be separated by a
hyperplane.
Theorem. (Separation of Two Disjoint Convex Sets) Let S1 and S2
be nonempty convex sets in IRn and suppose that S1 ∩ S2 is empty.
Then there exists a hyperplane that separates S1 and S2 ; that is, there
exists a nonzero vector p in IRn such that
inf{〈p, x〉|x ∈ S1} ≥ sup{〈p, x〉|x ∈ S2}.
[Figure: hyperplane H separating the disjoint convex sets S1 and S2]
Separation of Nondisjoint Convex Sets
The previous result (Separation of Two Disjoint Convex Sets) holds true
even if the two sets have some points in common, as long as their interiors
are disjoint.
Corollary. Let S1 and S2 be nonempty convex sets in IRn. Suppose
that int S2 is not empty and that S1 ∩ int S2 is empty. Then, there
exists a hyperplane that separates S1 and S2 ; that is, there exists a
nonzero p such that
inf{〈p, x〉|x ∈ S1} ≥ sup{〈p, x〉 : x ∈ S2}
[Figure: hyperplane H separating S1 and S2 when S1 and int S2 are disjoint]
Conjugate of a convex function. Motivation
Let f be proper, closed and convex. We know that f is the pointwise
supremum of the collection of all affine functions h such that h ≤ f .
Consider F ∗ = {(x∗, µ∗) ∈ IRn × IR | h(x) = 〈x , x∗〉 − µ∗ ≤ f (x) ∀x}.
Then
h(x) ≤ f (x) ∀x ⇔ µ∗ ≥ sup{〈x , x∗〉 − f (x)|x ∈ IRn}.
In that case, we have F ∗ = epi f ∗ where
f ∗(x∗) = sup{〈x , x∗〉 − f (x)|x ∈ IRn}
f ∗ is called the conjugate of f .
Conjugate of a convex function
Definition. Let f : IRn → IR ∪ {+∞} be proper and lower bounded by
an affine function. Then the function f ∗ : IRn → IR ∪ {+∞} defined by
f ∗(x) = sup y∈IRn {〈x , y〉 − f (y)}
is called the conjugate function of f .
Proposition. Let f : IRn → IR ∪ {+∞} be proper and lower bounded
by an affine function. Then the conjugate function of f is well defined
and is proper, closed and convex.
Proposition. The following properties hold :
(i) If f is proper and convex then f ∗∗ ≤ f
(ii) If f is proper, closed and convex then f ∗∗ = f .
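For f (x) = x²/2 the conjugate is f ∗(y) = y²/2 (the sup is attained at x = y); a brute-force sketch approximating the sup over a grid (the grid bounds and step are an arbitrary choice of ours):

```python
def f(x):
    return 0.5 * x * x

def fstar(y, grid):
    # numerical conjugate: f*(y) = sup_x (x*y - f(x)), sup taken over a grid
    return max(x * y - f(x) for x in grid)

grid = [i / 100.0 for i in range(-500, 501)]   # x in [-5, 5], step 0.01
for y in (-2.0, 0.0, 1.5):
    # for f(x) = x^2/2 the conjugate is f*(y) = y^2/2
    assert abs(fstar(y, grid) - 0.5 * y * y) < 1e-3
```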
Infimal convolution, support function and conjugacy
Proposition. Let f1, . . . , fm be proper convex functions. Then
(f1 ⊕ · · · ⊕ fm)∗ = f ∗1 + · · ·+ f ∗m
(cl f1 + · · ·+ cl fm)∗ = cl (f ∗1 ⊕ · · · ⊕ f ∗m).
Furthermore, if the sets ri dom fi , i = 1, . . . ,m, have a point in common,
the closure operation can be omitted from the second formula.
Proposition. The indicator function and the support function of a
closed convex set are conjugate to each other.
Polar of a convex set
It is well-known that when V is a vector subspace of IRn, any x can be
decomposed in a unique way as x1 + x2 with x1 ∈ V and x2 ∈ V⊥. More
precisely x = PV (x) + PV⊥(x).
Our aim is to replace V by a closed convex cone. So what is V⊥ ?
Definition. Let K be a nonempty convex set. The polar of K is
K 0 = {x∗ ∈ IRn | 〈x∗, x〉 ≤ 1 for all x ∈ K}.
Proposition. If K is a convex cone, then the polar of K is the cone
K 0 = {x∗ | 〈x∗, x〉 ≤ 0 for all x ∈ K}.
Furthermore, K 00 is the closure of K . In particular if K is closed, then
K 00 = K .
Illustration and main property
Proposition. (J.J. Moreau) Let K be a closed convex cone. For the
three elements x , x1, x2 in IRn, the properties below are equivalent :
(i) x = x1 + x2 with x1 ∈ K , x2 ∈ K 0 and 〈x1, x2〉 = 0 ;
(ii) x1 = PK (x) and x2 = PK 0(x).
Chapter 4.
Subdifferential calculus for convex functions
Directional derivative
Definition. Let f : IRn → IR ∪ {+∞}, x ∈ domf and d ∈ IRn. The
directional derivative of f at x in the direction d is
f ′(x , d) = lim t↓0 [f (x + td)− f (x)]/t
if the limit exists (−∞ and +∞ are allowed)
Proposition. Let f : IRn → IR ∪ {+∞} be a convex function, and let
x ∈ dom f . For each d , the difference quotient in the definition of
f ′(x , d) is a non-decreasing function of t > 0, so that
f ′(x , d) = inf t>0 [f (x + t d)− f (x)]/t
If f is differentiable at x , then f ′(x , d) = 〈∇f (x), d〉.
Subdifferential. Motivation
To obtain the simplest possible dual representation of a closed convex set
C , we have introduced the notion of supporting hyperplane.
When taking for C the epigraph of a closed convex proper function f , the
corresponding notion is that of an exact minorant : an affine function
l : IRn → IR is an exact minorant of f at x if l ≤ f and l(x) = f (x)
Setting l(y) = 〈x∗, y〉+ α, this becomes
f (y) ≥ 〈x∗, y〉+ α ∀y ∈ IRn, and f (x) = 〈x∗, x〉+ α
which is equivalent to (α = f (x)− 〈x∗, x〉)
∀y ∈ IRn f (y) ≥ f (x) + 〈x∗, y − x〉
This leads to the following definition.
Subgradient. Subdifferential
Definition. Let f : IRn → IR ∪ {+∞} be a convex function. A vector
x∗ is said to be a subgradient of f at x if
∀y ∈ IRn f (y) ≥ f (x) + 〈x∗, y − x〉.
The set of subgradients of f at x is called the subdifferential of f at x
and is denoted by ∂f (x).
Proposition. The subdifferential ∂f (x) is a closed convex set. It may
be empty.
Examples
f (x) = |x | :
∂f (0) = [−1, 1], ∂f (x) = {1} if x > 0, ∂f (x) = {−1} if x < 0
f (x) = ex − 1 if x ≥ 0 and 0 if x < 0 :
∂f (0) = [0, 1], ∂f (x) = {ex} if x > 0, ∂f (x) = {0} if x < 0
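The subgradient inequality from the definition can be tested pointwise for f (x) = |x | at x = 0, recovering ∂f (0) = [−1, 1]; a minimal sketch (helper names are ours, with the inequality checked only on sample points):

```python
def f(x):
    return abs(x)

def is_subgradient(g, x, samples):
    # check f(y) >= f(x) + g*(y - x) on the sample points y
    return all(f(y) >= f(x) + g * (y - x) - 1e-12 for y in samples)

ys = [i / 10.0 for i in range(-50, 51)]
# every g in [-1, 1] passes the subgradient test at 0 ...
assert all(is_subgradient(g, 0.0, ys) for g in (-1.0, -0.3, 0.0, 0.7, 1.0))
# ... and slopes outside [-1, 1] fail it
assert not is_subgradient(1.2, 0.0, ys)
assert not is_subgradient(-1.5, 0.0, ys)
```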
[Figure: graphs of f (x) = |x | and of f (x) = ex − 1 for x ≥ 0, 0 for x < 0]
Proposition. Let f : IRn → IR ∪ {+∞} be a convex function and let
x ∈ dom f . Then x∗ is a subgradient of f at x if and only if
f ′(x , d) ≥ 〈x∗, d〉, ∀d .
In fact, the closure of f ′(x , d) as a convex function of d is the support
function of the closed convex set ∂f (x).
Proposition. Let f : IRn → IR ∪ {+∞} be a convex function. Then
∂f (x) is empty when x ∉ dom f , and nonempty when x ∈ ri(dom f ). In
that case f ′(x , d) is closed and proper as a function of d , and
f ′(x , d) = sup{〈x∗, d〉 | x∗ ∈ ∂f (x)} = δ∗(d | ∂f (x)).
Finally, ∂f (x) is a nonempty bounded set if and only if x ∈ int(dom f ),
in which case f ′(x , d) is finite for every d .
Proposition. For any proper convex function f and any vector x , the
following four conditions on a vector x∗ are equivalent to each other :
(a) x∗ ∈ ∂f (x)
(b) 〈x∗, z〉 − f (z) achieves its supremum in z at z = x
(c) f (x) + f ∗(x∗) ≤ 〈x∗, x〉
(d) f (x) + f ∗(x∗) = 〈x∗, x〉
If (cl f )(x) = f (x), three more conditions can be added to this list
(e) x ∈ ∂f ∗(x∗)
(f) 〈z∗, x〉 − f ∗(z∗) achieves its supremum in z∗ at z∗ = x∗
(g) x∗ ∈ ∂(clf )(x)
Subgradient and monotonicity
Definition. An operator T : IRn → 2IRn is said to be monotone, if for
any x , y and any x∗, y∗ belonging to Tx and Ty respectively,
〈y∗ − x∗, y − x〉 ≥ 0.
T is said to be maximal monotone if it is monotone and its graph
cannot be properly included into the graph of another monotone
operator on IRn
Proposition. Let f : IRn → IR ∪ {+∞} be a proper, closed and convex
function. Then ∂f is a maximal monotone operator.
Derivability of convex functions
Let f be a function from IRn to [−∞,+∞], and let x be a point where f
is finite. According to the usual definition, f is differentiable at x if and
only if there exists a vector x∗ (necessarily unique) with the property that
f (y) = f (x) + 〈x∗, y − x〉+ o(|y − x |)
or in other words
lim y→x [f (y)− f (x)− 〈x∗, y − x〉] / |y − x | = 0.
Such an x∗, if it exists, is called the gradient of f at x and is denoted by
∇f (x)
Proposition. Let f be a convex function, and let x be a point where
f is finite. If f is differentiable at x , then ∇f (x) is the unique
subgradient of f at x , so in particular
∀y ∈ IRn f (y) ≥ f (x) + 〈∇f (x), y − x〉.
Conversely, if f has a unique subgradient at x , then f is differentiable at
x and ∂f (x) = {∇f (x)}.
Proposition. Let f be a convex function on IRn, and let x be a point
at which f is finite. A necessary and sufficient condition for f to be
differentiable at x is that the directional derivative function f ′(x , ·) be
linear. Moreover, this condition is satisfied if merely the n two-sided
partial derivatives ∂f (x)/∂ξj exist at x and are finite.
Proposition. Let f be a proper convex function on IRn, and let D be
the set of points where f is differentiable. Then D is a dense subset of
int(dom f ), and its complement in int(dom f ) is a set of measure zero.
Furthermore, the gradient mapping ∇f : x 7→ ∇f (x) is continuous on D.
Finally
∂f (x) = conv {x∗ ∈ IRn | ∃{xi} ⊂ D with xi → x and ∇f (xi )→ x∗}
Calculus rules
Proposition. Let f1, f2 be two proper closed convex functions on IRn.
For x ∈ dom f1 ∩ dom f2, there holds
∂(f1 + f2)(x) ⊃ ∂f1(x) + ∂f2(x)
with equality when ri dom f1 ∩ ri dom f2 ≠ ∅ or int dom f1 ∩ dom f2 ≠ ∅.
Proposition. Let f1, f2 be two proper closed convex functions on IRn.
Let f = f1 ⊕ f2 and x = y1 + y2 with yi ∈ dom fi , i = 1, 2. Then
∂f1(y1) ∩ ∂f2(y2) ⊂ ∂f (x)
If ∂f1(y1) ∩ ∂f2(y2) ≠ ∅, the inf-convolution is exact at x = y1 + y2 and
equality holds.
Moreau-Yosida Regularization
Definition. Let c > 0 and let f : IRn → IR ∪ {+∞} be a proper closed
convex function. The function fc = f ⊕ (c/2)‖ · ‖2 is called the
Moreau-Yosida regularization of f . One has
fc(x) = min {f (y) + (c/2)‖x − y‖2 | y ∈ IRn}
Denote by xc the unique minimum y above, characterized by
0 ∈ ∂f (xc) + c(xc − x)
It can be proven that fc is a differentiable convex function and that its
gradient is given by
∇fc(x) = c(x − xc)
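For f = | · | the minimizer xc is given by soft-thresholding at 1/c, and the gradient formula ∇fc(x) = c(x − xc) can be checked against a finite difference; a minimal sketch (function names are ours):

```python
def prox_abs(x, c):
    # x_c = argmin_y |y| + (c/2)(x - y)^2 : soft-thresholding at 1/c
    if x > 1.0 / c:
        return x - 1.0 / c
    if x < -1.0 / c:
        return x + 1.0 / c
    return 0.0

def moreau_env(x, c):
    # f_c(x) = |x_c| + (c/2)(x - x_c)^2 (the Huber function)
    y = prox_abs(x, c)
    return abs(y) + 0.5 * c * (x - y) ** 2

c = 2.0
for x in (-3.0, -0.2, 0.0, 0.1, 5.0):
    xc = prox_abs(x, c)
    # gradient formula grad f_c(x) = c (x - x_c), checked by central difference
    h = 1e-6
    num = (moreau_env(x + h, c) - moreau_env(x - h, c)) / (2 * h)
    assert abs(num - c * (x - xc)) < 1e-5
```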
Calculus rules
Let {fj}j∈J be a collection of convex functions from IRn into IR and let
f : IRn → IR ∪ {+∞} be defined, for all x , by
f (x) = sup j∈J fj(x).
Proposition. Assume f is finite. Then the following inclusion holds
for every x ∈ IRn :
∂ f (x) ⊃ conv {∪ ∂fj(x) | j ∈ J(x) },
where J(x) = { j ∈ J | fj(x) = f (x) }.
To obtain equality, we suppose that J is finite. In that case
the set J(x) is nonempty.
Calculus rules
Proposition. If J = { 1, . . . ,m }, then
∂ f (x) = conv {∪ ∂fj(x) | j ∈ J(x) }.
Corollary. Let f1, . . . , fm be differentiable convex functions from IRn into
IR . Then f = sup1≤j≤m fj is convex and for all x ∈ IRn,
∂f (x) = conv {∇fj(x) | j ∈ J(x) }.
Example : Consider the function f (x) = max{f1(x), f2(x), f3(x)} where
f1(x) = −x1 − x2, f2(x) = −x1 + x2 and f3(x) = x1,
and the point (4, 8). Since J((4, 8)) = {2, 3}, ∇f2(4, 8) = (−1, 1)T and
∇f3(4, 8) = (1, 0)T , we have
∂f (4, 8) = conv {(−1, 1)T , (1, 0)T}.
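This example can be checked numerically (a sketch using NumPy; the functions and gradients are transcribed from the slide): compute the active index set at (4, 8) and verify that every convex combination of ∇f2 and ∇f3 satisfies the subgradient inequality.

```python
import numpy as np

fs = [lambda x: -x[0] - x[1], lambda x: -x[0] + x[1], lambda x: x[0]]
grads = [np.array([-1.0, -1.0]), np.array([-1.0, 1.0]), np.array([1.0, 0.0])]
f = lambda x: max(g(x) for g in fs)

x = np.array([4.0, 8.0])
vals = [g(x) for g in fs]                  # [-12.0, 4.0, 4.0]
J = [j for j, v in enumerate(vals) if v == max(vals)]
print(J)                                   # [1, 2]: f2 and f3 are active (0-based)

# every convex combination of the active gradients is a subgradient:
# f(y) >= f(x) + <s, y - x> for all y
rng = np.random.default_rng(0)
for t in np.linspace(0.0, 1.0, 5):
    s = t * grads[1] + (1.0 - t) * grads[2]
    for y in rng.normal(size=(20, 2)) * 10.0:
        assert f(y) >= f(x) + s @ (y - x) - 1e-9
```

Since f is a maximum of affine functions, the subgradient inequality holds exactly for every element of conv {∇f2, ∇f3}.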
Chapter 5.
Duality in convex optimization
The Fermat Rule
Proposition. Let f : IRn → IR ∪ {+∞} be a closed convex and proper
function. Then, for an element x∗ ∈ IRn, the following two statements
are equivalent :
(i) f (x∗) ≤ f (x) for all x ∈ IRn
(ii) 0 ∈ ∂f (x∗)
The necessary and sufficient condition 0 ∈ ∂f (x∗) is an extension of the
classical optimality condition for convex C 1 functions : ∇f (x∗) = 0.
So the minimizers of f can be found by solving the
generalized equation 0 ∈ ∂f (x)
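For instance (an illustrative sketch, not from the slides), take f(x) = |x| + (x − 2)²/2 on IR. Away from 0 its subdifferential is {sign(x) + (x − 2)}, so for x > 0 the generalized equation 0 ∈ ∂f(x) reads 1 + (x − 2) = 0, giving x∗ = 1, which a grid search confirms:

```python
import numpy as np

# f(x) = |x| + (x - 2)^2 / 2 is closed, convex and proper on IR
f = lambda x: np.abs(x) + 0.5 * (x - 2.0) ** 2

# Fermat rule: for x > 0, 0 ∈ ∂f(x) reads 1 + (x - 2) = 0, so x* = 1
x_star = 1.0
assert 1.0 + (x_star - 2.0) == 0.0        # 0 ∈ ∂f(x*)

# cross-check by a fine grid search
xs = np.linspace(-3.0, 5.0, 100001)
x_num = xs[np.argmin(f(xs))]
print(x_num)                              # ≈ 1.0
```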
The constrained convex problem
Consider the following optimization problem
(P) min {f0(x) | x ∈ C}
where f0 : IRn → IR ∪ {+∞} is a closed convex and proper function (called
the objective function) and C is a closed convex nonempty subset of IRn
(the constraint set). Assume dom f0 ∩ C ≠ ∅.
Setting f = f0 + δC , this problem can be written in the equivalent form
min {f (x) | x ∈ IRn}
When dom f0 ∩ int C ≠ ∅, we have ∂f (x) = ∂f0(x) + ∂δC (x). So
x∗ optimal solution of (P) ⇔ 0 ∈ ∂f0(x∗) + ∂δC (x∗)
To describe ∂δC (x), we need to introduce the notion of tangent and
normal cone to C at x .
The tangent and normal cones
Definition. Let C be a closed convex nonempty subset of IRn and let
x ∈ C .
(a) The tangent cone to C at x , denoted TC (x), is defined by
TC (x) = cl (∪λ≥0 λ (C − x))
It is the closure of the cone spanned by C − x .
(b) The normal cone NC (x) to C at x is the polar cone of TC (x) :
NC (x) = {x∗ ∈ IRn | 〈x∗, y〉 ≤ 0 ∀y ∈ TC (x)}
= {x∗ ∈ IRn | 〈x∗, y − x〉 ≤ 0 ∀y ∈ C}
Illustration
(Figure: tangent cones and normal cones to a convex set at boundary points.)
Properties
Proposition. Let C be a closed convex nonempty subset of IRn and let
x ∈ C . Then
(i) TC (x) is a closed convex cone containing 0
(ii) TC (x) = IRn when x ∈ int C
(iii) NC (x) is a closed convex cone containing 0
(iv) NC (x) = {0} when x ∈ int C
Proposition. Let C be a closed convex nonempty subset of IRn and let
x ∈ C . Then
∂δC (x) = NC (x)
The constrained convex problem
Consider again the following optimization problem
(P) min {f0(x) | x ∈ C}
where f0 : IRn → IR ∪ {+∞} is a closed convex and proper function and C
is a closed convex nonempty subset of IRn.
Proposition. Assume that the following qualification assumption is
satisfied :
dom f0 ∩ int C ≠ ∅
Then the following statements are equivalent :
(i) x∗ is an optimal solution to (P) ;
(ii) x∗ is a solution to the equation 0 ∈ ∂f0(x∗) + NC (x∗) ;
(iii) x∗ ∈ C and ∃ s ∈ ∂f0(x∗) such that 〈s, x − x∗〉 ≥ 0 ∀x ∈ C
The mathematical programming problem
Consider the problem
(P)
{
min f (x)
s.t. gi (x) ≤ 0, i = 1, . . . ,m
where f : IRn → IR ∪ {+∞} is closed convex and proper, and
g1, . . . , gm : IRn → IR are convex.
Here the constraint set C has the following specific form
C = { x ∈ IRn | gi (x) ≤ 0, i = 1, . . . ,m}
This problem is of fundamental importance : a large number of problems
in decision sciences, engineering, and so forth can be written as
mathematical programming problems.
NC (x) when C = {x ∈ IRn | g(x) ≤ 0}
Proposition. Let C = {x ∈ IRn | g(x) ≤ 0} where g : IRn → IR is
convex (and thus also continuous). Assume that C satisfies the
following Slater property :
there exists some x0 ∈ C such that g(x0) < 0
Then, for every x ∈ C
NC (x) = {0} if g(x) < 0, and NC (x) = IR+ ∂g(x) if g(x) = 0.
As a consequence,
s ∈ NC (x) ⇔ ∃λ ≥ 0 such that s ∈ λ∂g(x) and λg(x) = 0
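As an illustration (a sketch with example data chosen here, not from the slides), take g(x) = ‖x‖² − 1, so C is the closed unit ball and Slater holds at x0 = 0. At a boundary point x, NC(x) = IR+ ∇g(x) = IR+ · 2x, and the defining inequality ⟨s, y − x⟩ ≤ 0 can be sampled over C:

```python
import numpy as np

# C = {x : g(x) <= 0} with g(x) = ||x||^2 - 1 (unit ball); Slater holds at 0
rng = np.random.default_rng(1)
x = np.array([0.6, 0.8])                 # boundary point: g(x) = 0
for lam in [0.0, 0.5, 3.0]:
    s = lam * 2.0 * x                    # candidate normal: s ∈ IR+ ∂g(x)
    # normal-cone inequality <s, y - x> <= 0 for sampled y in C
    for _ in range(200):
        y = rng.normal(size=2)
        y = y / max(1.0, np.linalg.norm(y))   # project into the ball
        assert s @ (y - x) <= 1e-9
print("all normal-cone inequalities hold")
```

Here ⟨s, y − x⟩ = 2λ(⟨x, y⟩ − 1) ≤ 0 for every y in the ball, confirming s ∈ NC(x).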
NC (x) when C = {x ∈ IRn | gi (x) ≤ 0, i = 1, . . . , m}
Proposition. Let C = ∩1≤i≤mCi where for each i = 1, . . . ,m
Ci = {x ∈ IRn | gi (x) ≤ 0} and gi : IRn → IR, i = 1, . . . ,m, are convex.
Assume that C satisfies the following Slater property :
there exists some x0 ∈ C such that gi (x0) < 0, i = 1, . . . ,m
Then x0 ∈ ∩i int Ci , δC = δC1 + · · ·+ δCm , and (by the subdifferential
rule for the sum of convex functions)
∂δC = ∂δC1 + · · ·+ ∂δCm
As a consequence, for every x ∈ C ,
NC (x) = NC1(x) + · · ·+ NCm(x)
Karush-Kuhn-Tucker optimality conditions
Theorem. Suppose that f : IRn → IR ∪ {+∞} is closed convex and
proper and that g1, . . . , gm : IRn → IR are convex. Suppose also that the
Slater qualification assumption is satisfied :
∃ x0 ∈ IRn such that f (x0) < +∞ and gi (x0) < 0 ∀i = 1, . . . ,m
Then the following statements are equivalent :
(i) x∗ is a solution of problem (P) ;
(ii) there exist λ1, . . . , λm such that
0 ∈ ∂f (x∗) + λ1 ∂g1(x∗) + · · ·+ λm ∂gm(x∗),
λi ≥ 0, i = 1, . . . ,m
λi gi (x∗) = 0, i = 1, . . . ,m
gi (x∗) ≤ 0, i = 1, . . . ,m.
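To make the theorem concrete, here is a small verification (the problem is a hypothetical example chosen for illustration, not taken from the slides): for min (x1 − 2)² + (x2 − 1)² subject to x1 + x2 − 1 ≤ 0 and −x1 ≤ 0, the point x∗ = (1, 0) with multipliers λ∗ = (2, 0) satisfies all four KKT conditions.

```python
import numpy as np

# (P) min (x1-2)^2 + (x2-1)^2  s.t.  g1 = x1 + x2 - 1 <= 0,  g2 = -x1 <= 0
x = np.array([1.0, 0.0])          # candidate solution
lam = np.array([2.0, 0.0])        # candidate multipliers

grad_f = 2.0 * (x - np.array([2.0, 1.0]))
grad_g = np.array([[1.0, 1.0],    # ∇g1
                   [-1.0, 0.0]])  # ∇g2
g = np.array([x[0] + x[1] - 1.0, -x[0]])

assert np.allclose(grad_f + lam @ grad_g, 0.0)  # stationarity
assert np.all(lam >= 0.0)                       # dual feasibility
assert np.allclose(lam * g, 0.0)                # complementarity
assert np.all(g <= 0.0)                         # primal feasibility
print("KKT verified at x* = (1, 0) with lambda* = (2, 0)")
```

Only g1 is active at x∗, so λ∗2 = 0 in accordance with the complementarity conditions discussed next.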
Comments
Let us introduce the Lagrangian function
L(x , λ) = f (x) + ∑mi=1 λigi (x)
Then the first KKT condition ⇔ 0 ∈ ∂x L(x∗, λ∗)
λ∗ is called the vector of Lagrange multipliers
The conditions λ∗i gi (x∗) = 0 are called complementarity conditions :
either λ∗i = 0 or gi (x∗) = 0
Given the sign of λ∗i , these conditions can be written :
(∀ i = 1, . . . ,m) (λ∗i > 0 ⇒ gi (x∗) = 0)
In other terms, a multiplier associated with an inactive constraint
(i.e., gi (x∗) < 0) is equal to zero.
Min-max duality
The basic concept is the concept of saddle point.
Definition 5.1.1 A Saddle Problem calls for a solution (x∗, y∗) of a
double inequality of the form :
F (x∗, y) ≤ F (x∗, y∗) ≤ F (x , y∗) for all (x , y) ∈ X × Y
where F : IRn × IRq → IR , X ⊆ IRn and Y ⊆ IRq are given.
A pair (x∗, y∗) that solves this saddle problem is called a saddle point
of F .
Example of a saddle point
(x∗, y∗) = (0, 0) is a saddle point of F (x , y) = x2 − y 2
Saddle problem. Saddle point
When (x∗, y∗) is a saddle point of F , one has
x∗ ∈ arg minx∈X F (x , y∗) and y∗ ∈ arg maxy∈Y F (x∗, y)
With every saddle problem are associated two optimization problems :
minx∈X p(x) ≡ supy∈Y F (x , y) and maxy∈Y d(y) ≡ infx∈X F (x , y)
The two optimization problems can be expressed as
a minimax problem : infx∈X supy∈Y F (x , y),
and a maximin problem : supy∈Y infx∈X F (x , y)
Characterization of saddle points
Theorem 5.1.1 Let X × Y ⊆ IRn × IRq and let F : X × Y → IR be a
given function. Then
supy∈Y infx∈X F (x , y) ≤ infx∈X supy∈Y F (x , y)
For (x∗, y∗) ∈ X × Y , the following conditions are equivalent :
(1) (x∗, y∗) is a saddle point of F on X × Y ,
(2) x∗ ∈ arg minx∈X p(x), y∗ ∈ arg maxy∈Y d(y), and
supy∈Y infx∈X F (x , y) = infx∈X supy∈Y F (x , y)
(3) p(x∗) = d(y∗) = F (x∗, y∗)
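These conditions can be checked numerically on the example F(x, y) = x² − y² (an illustrative sketch; a finite grid stands in for X = Y = IR):

```python
import numpy as np

F = lambda x, y: x**2 - y**2
grid = np.arange(-30, 31) / 10.0           # finite stand-in for X and Y

p = lambda x: max(F(x, y) for y in grid)   # p(x) = sup_y F(x, y)
d = lambda y: min(F(x, y) for x in grid)   # d(y) = inf_x F(x, y)

# condition (3): p(x*) = d(y*) = F(x*, y*) at (x*, y*) = (0, 0)
assert p(0.0) == d(0.0) == F(0.0, 0.0) == 0.0
# the saddle inequality F(x*, y) <= F(x*, y*) <= F(x, y*)
assert all(F(0.0, y) <= 0.0 for y in grid)
assert all(F(x, 0.0) >= 0.0 for x in grid)
print("(0, 0) is a saddle point; minimax = maximin = 0")
```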
Lagrangian duality
Consider the problem
(P)
{
min f (x)
s.t. gi (x) ≤ 0, i = 1, . . . ,m
where f , gi : IRn → IR, i = 1, . . . ,m.
We denote by p∗ the optimal value of (P).
In view of writing problem (P) under the form of a min-max problem,
we consider the Lagrangian function defined by
L(x , λ) = f (x) + ∑mi=1 λigi (x)
Lagrangian duality (I)
We use the min-max duality with X = IRn, Y = IRm+ and
F (x , λ) = L(x , λ)
So we have
p(x) = supλ∈Y L(x , λ) and d(λ) = infx∈X L(x , λ)
The corresponding optimization problems are :
(PP) infx∈X supλ∈Y L(x , λ)
and
(PD) supλ∈Y infx∈X L(x , λ)
Lagrangian duality (II)
The primal function p(x) = supλ∈Y L(x , λ) is
p(x) = supλ∈IRm+ {f (x) + ∑mi=1 λigi (x)}
so that p(x) = f (x) if x is feasible, and p(x) = +∞ otherwise
The minimax problem (PP) is nothing other than (P)
The problem (PD) is called the dual problem associated with (P)
The variables λ of (PD) are the dual variables
The dual problem
Problem (PD) will be denoted (D) and written under the form :
(D)
{
max d(λ)
s.t. λ ≥ 0
The function d(λ) = infx∈IRn L(x , λ) is called the dual function
Proposition. The dual function d(λ) is a concave function
Example
Consider the problem min f (x) = x2 subject to 1− x ≤ 0
The solution to this problem is x∗ = 1 with λ∗ = 2 as multiplier.
The Lagrangian function is L(x , λ) = x2 + λ(1− x)
The dual function is d(λ) = infx∈IR L(x , λ) = infx∈IR {x2 + λ(1− x)}
for λ ≥ 0.
The minimizer of the right-hand side is x = λ/2, so that the dual
function is d(λ) = λ− λ2/4 for all λ ≥ 0
The dual problem is therefore maxλ≥0 λ− λ2/4
The solution is λ∗ = 2. The optimal value is d∗ = 1
Furthermore p∗ = d∗
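The computation above can be reproduced numerically (a sketch; finite grids stand in for IR and IR+):

```python
import numpy as np

# Primal: min x^2 s.t. 1 - x <= 0;  dual: max over lam >= 0 of lam - lam^2/4
lams = np.linspace(0.0, 4.0, 401)
dvals = lams - 0.25 * lams**2
lam_star = lams[np.argmax(dvals)]
d_star = dvals.max()

xs = np.linspace(1.0, 3.0, 201)      # feasible region is x >= 1
p_star = (xs**2).min()

print(lam_star, d_star, p_star)      # 2.0, 1.0, 1.0  (so p* = d*)
```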
Weak duality Theorem
Proposition.
Let p∗ and d∗ be the optimal values of problems (P) and (D),
respectively. Then d∗ ≤ p∗. In particular if x is feasible for (P) and
λ ∈ IRm+, then d(λ) ≤ f (x)
Corollary. If x is feasible for (P), λ ∈ IRm+ and d(λ) = f (x), then x is a
solution of (P), λ is a solution of (D).
The dual is infeasible if d(λ) = −∞ for all λ ∈ IRm+
Corollary. If the primal problem is unbounded, then its dual is infeasible.
If the dual is unbounded, then the primal is infeasible
When is the duality gap equal to zero ?
We have the following equivalent conditions :
(1) (x∗, λ∗) is a saddle point of L on IRn × IRm+ ,
(2) x∗ and λ∗ are solutions of (P) and (D), respectively ; p∗ = d∗
(3) f (x∗) = d(λ∗) = L(x∗, λ∗)
In other words, we have the following result :
Proposition. If (x∗, λ∗) is a saddle point of the Lagrangian function,
then
x∗ is a solution to (P), λ∗ is a solution to (D)
The strong duality property holds true : p∗ = d∗
(x∗, λ∗) satisfies the KKT conditions.
Strong Duality Theorem I
Proposition. Assume that problem (P) is convex. Then (x∗, λ∗)
satisfies (KKT) ⇒ (x∗, λ∗) is a saddle point of the Lagrangian function
As a consequence we obtain the strong duality theorem
Theorem. Assume that problem (P) is convex and that the Slater
condition is satisfied.
Let x∗ be a solution to (P). Then
the vector λ∗ of Lagrange multipliers associated with x∗ is a solution to (D)
the strong duality property p∗ = d∗ holds.
Strong Duality Theorem II
Theorem. Assume that problem (P) is convex and that all the
constraints are affine. Let x∗ be a solution to (P). Then
the vector λ∗ of Lagrange multipliers associated with x∗ is a solution to (D)
the strong duality property p∗ = d∗ holds.
Duality theory is interesting when
the dual problem is easier to solve than the primal problem
it is possible to find a solution of the primal problem when a solution
of the dual is known.
This can be done when the strong duality property holds
Solving the dual to get the solution of (P)
Let λ∗ be a solution to (D). In order to recover a primal solution from λ∗,
a strategy consists in finding x∗ such that (x∗, λ∗) is a saddle point of L.
Observe that this strategy implies that the strong duality property holds.
Proposition. Let λ∗ be a solution to the dual problem (D). Then every
point x∗ such that
(1) L(x∗, λ∗) = minx∈IRn L(x , λ∗)
(2) all the constraints of problem (P) are satisfied at x∗
(3) λ∗i gi (x∗) = 0, i = 1, . . . ,m
is a solution to problem (P).
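Applied to the earlier one-dimensional example (a sketch; the closed-form minimizer of L(·, λ∗) stands in for a numerical solver):

```python
# min x^2 s.t. 1 - x <= 0, with dual solution lam* = 2 from before
lam = 2.0

# (1) minimize L(x, lam*) = x^2 + lam*(1 - x): stationarity gives x = lam/2
x = lam / 2.0
g = 1.0 - x                      # constraint value at the candidate point

assert g <= 1e-12                # (2) primal feasibility
assert abs(lam * g) < 1e-12      # (3) complementarity
print("recovered primal solution x* =", x)   # x* = 1.0
```

All three conditions of the proposition hold, so the recovered point x∗ = 1 solves (P).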
Fenchel’s duality
Consider the problem of minimizing a difference f (x)− g(x) where f is a
proper convex function and −g is a proper convex function.
Particular case : minimizing f over a convex set C (take g = −δC ).
The duality consists in the connection between minimizing f − g and
maximizing the concave function g∗ − f ∗.
Proposition. Let f ,−g be proper convex functions. One has
infx {f (x)− g(x)} = supx∗ {g∗(x∗)− f ∗(x∗)}
if either of the following conditions is satisfied
(a) ri dom f ∩ ri dom g ≠ ∅,
(b) f and g are closed, and ri dom g∗ ∩ ri dom f ∗ ≠ ∅
Under (a) the supremum is attained at some x∗, while under (b) the
infimum is attained at some x .