First-Order Conditions For Convexity

发布时间 2023-11-15 16:36:45作者: 马路野狼

Statement of the First-Order Condition for Convexity

For a differentiable function $ f: \mathbb{R}^n \to \mathbb{R} $, $ f $ is convex on a convex set $ C \subseteq \mathbb{R}^n $ if and only if for all $ \mathbf{x}, \mathbf{y} \in C $ the following inequality holds:

\[f(\mathbf{y}) \geq f(\mathbf{x}) + \nabla f(\mathbf{x})^\top (\mathbf{y} - \mathbf{x}) \]

Proof

Part 1: Necessity

Assume $ f $ is convex. We need to show that the first-order condition holds.

  1. Convexity of $ f $: By definition of convexity, for any $ \mathbf{x}, \mathbf{y} \in C $ and $ \theta $ with $ 0 \leq \theta \leq 1 $, we have:

    \[f(\theta \mathbf{y} + (1 - \theta) \mathbf{x}) \leq \theta f(\mathbf{y}) + (1 - \theta) f(\mathbf{x}) \]

  2. Apply the Definition to a Point Slightly Away from $ \mathbf{x} $: Choose $ \theta $ to be a small positive number $ h $, and set $ \mathbf{z} = \mathbf{x} + h(\mathbf{y} - \mathbf{x}) $. Then, the above inequality becomes:

    \[f(\mathbf{x} + h(\mathbf{y} - \mathbf{x})) \leq hf(\mathbf{y}) + (1 - h)f(\mathbf{x}) \]

  3. Rearrange and Divide by $ h $:

    \[\frac{f(\mathbf{x} + h(\mathbf{y} - \mathbf{x})) - f(\mathbf{x})}{h} \leq f(\mathbf{y}) - f(\mathbf{x}) \]

  4. Take the Limit as $ h \to 0 $:

    \[\lim_{h \to 0} \frac{f(\mathbf{x} + h(\mathbf{y} - \mathbf{x})) - f(\mathbf{x})}{h} = \nabla f(\mathbf{x})^\top (\mathbf{y} - \mathbf{x}) \]

    Therefore, $ f(\mathbf{y}) \geq f(\mathbf{x}) + \nabla f(\mathbf{x})^\top (\mathbf{y} - \mathbf{x}) $.

Part 2: Sufficiency

Given:

The first-order condition states that for all $ \mathbf{x}, \mathbf{y} \in C $:

\[f(\mathbf{y}) \geq f(\mathbf{x}) + \nabla f(\mathbf{x})^\top (\mathbf{y} - \mathbf{x}) \]

Goal:

To prove that $ f $ is convex, i.e., for any $ \mathbf{x}, \mathbf{y} \in C $ and for any $ \theta $ with $ 0 \leq \theta \leq 1 $, the following inequality holds:

\[f(\theta \mathbf{x} + (1 - \theta) \mathbf{y}) \leq \theta f(\mathbf{x}) + (1 - \theta) f(\mathbf{y}) \]

Proof Steps:

Step 1: Apply First-Order Condition with $ \mathbf{x} $ and $ \mathbf{y} $

For $ \mathbf{z} = \theta \mathbf{x} + (1 - \theta) \mathbf{y} $ and using $ \mathbf{y} $ in the first-order condition, we have:

\[f(\mathbf{y}) \geq f(\mathbf{z}) + \nabla f(\mathbf{z})^\top (\mathbf{y} - \mathbf{z}) \]

Step 2: Substitute $ \mathbf{z} $

\[f(\mathbf{y}) \geq f(\theta \mathbf{x} + (1 - \theta) \mathbf{y}) + \nabla f(\theta \mathbf{x} + (1 - \theta) \mathbf{y})^\top (\mathbf{y} - (\theta \mathbf{x} + (1 - \theta) \mathbf{y})) \]

Step 3: Expand and Rearrange

\[f(\mathbf{y}) \geq f(\theta \mathbf{x} + (1 - \theta) \mathbf{y}) + \theta\nabla f(\theta \mathbf{x} + (1 - \theta) \mathbf{y})^\top (\mathbf{y} - \mathbf{x}) \]

Step 4: Multiply by $ (1 - \theta) $

\[(1 - \theta) f(\mathbf{y}) \geq (1 - \theta) f(\theta \mathbf{x} + (1 - \theta) \mathbf{y}) + \theta(1 - \theta) \nabla f(\theta \mathbf{x} + (1 - \theta) \mathbf{y})^\top (\mathbf{y} - \mathbf{x}) \]

Step 5: Apply First-Order Condition with $ \mathbf{x} $ and $ \mathbf{z} $

Similarly, applying the condition with $ \mathbf{x} $ and $ \mathbf{z} $:

\[f(\mathbf{x}) \geq f(\mathbf{z}) + \nabla f(\mathbf{z})^\top (\mathbf{x} - \mathbf{z}) \]

Step 6: Substitute $ \mathbf{z} $ and Multiply by $ \theta $

\[\theta f(\mathbf{x}) \geq \theta f(\theta \mathbf{x} + (1 - \theta) \mathbf{y}) + \theta(1 - \theta) \nabla f(\theta \mathbf{x} + (1 - \theta) \mathbf{y})^\top (\mathbf{x} - \mathbf{y}) \]

Step 7: Combine and Simplify

Add the inequalities from Steps 4 and 6:

\[\theta f(\mathbf{x}) + (1 - \theta) f(\mathbf{y}) \geq \theta f(\theta \mathbf{x} + (1 - \theta) \mathbf{y}) + (1 - \theta) f(\theta \mathbf{x} + (1 - \theta) \mathbf{y}) + \text{[terms involving gradients]} \]

The terms involving the gradients will cancel

out, leaving:

\[\theta f(\mathbf{x}) + (1 - \theta) f(\mathbf{y}) \geq f(\theta \mathbf{x} + (1 - \theta) \mathbf{y}) \]