Some Classical Inequalities

Among all inequalities, there are a number of well-known classical ones. Many of them were proved by famous mathematicians and are named after them. These include, in particular, Bernoulli’s, Young’s, Hölder’s, Cauchy–Schwarz, and Minkowski’s inequalities (this is, of course, not a complete list).

Figure 1. Basic classical inequalities.

The relationships between the main classical inequalities can be represented in a tree diagram (Figure \(1\)), which shows that, for example, the Cauchy-Schwarz inequality follows from Hölder’s inequality, etc. Next, we take a closer look at each of the inequalities shown on the chart.

The Inequality \({\left( {1 + x} \right)^\alpha } \le 1 + \alpha x\) and Bernoulli’s Inequality

A source for the derivation of many classical inequalities is the simple inequality

\[{\left( {1 + x} \right)^\alpha } \le 1 + \alpha x,\]

where \(x \ge -1,\) \(0 \lt \alpha \lt 1.\)

This inequality can be proved using derivatives. Consider the function

\[f\left( x \right) = {\left( {1 + x} \right)^\alpha } - \alpha x - 1\]

provided \(x \ge -1.\) Differentiating it, we get

\[
{f'\left( x \right) = {\left[ {{{\left( {1 + x} \right)}^\alpha } - \alpha x - 1} \right]^\prime } }
= {\alpha {\left( {1 + x} \right)^{\alpha - 1}} - \alpha }
= {\alpha \left[ {{{\left( {1 + x} \right)}^{\alpha - 1}} - 1} \right].}
\]

It can be seen that the derivative is zero at \(x = 0.\) The sign of the derivative to the left and right of the point \(x = 0\) depends on the value of \(\alpha:\)

  1. If \(0 \lt \alpha \lt 1,\) the derivative \(f'\left( x \right)\) changes sign from plus to minus when passing through the point \(x = 0.\) In this case we have a maximum at \(x = 0.\)
  2. If \(\alpha \lt 0\) or \(\alpha \gt 1,\) the derivative \(f'\left( x \right)\) changes sign from minus to plus when passing through the point \(x = 0.\) Therefore, this point is a minimum.

Thus, for \(x \gt -1,\) the function \(f\left( x \right)\) in case \(1\) increases up to \(x = 0\) and decreases after it, so \(f\left( 0 \right)\) is its greatest value; in case \(2\) it decreases up to \(x = 0\) and increases after it, so \(f\left( 0 \right)\) is its least value. Take into account that \(f\left( 0 \right) = 0.\) Then, for \(x \ge -1\) (and \(x \gt -1\) when \(\alpha \lt 0,\) because \({\left( {1 + x} \right)^\alpha }\) is undefined at \(x = -1\) for negative \(\alpha\)), the following inequalities hold:

  1. \(f\left( x \right) \le 0\) for \(0 \lt \alpha \lt 1\);
  2. \(f\left( x \right) \ge 0\) for \(\alpha \lt 0\) or \(\alpha \gt 1\);

or

  1. \({\left( {1 + x} \right)^\alpha } - \alpha x - 1 \le 0\) for \(0 \lt \alpha \lt 1\);
  2. \({\left( {1 + x} \right)^\alpha } - \alpha x - 1 \ge 0\) for \(\alpha \lt 0\) or \(\alpha \gt 1\).

In the first case (when \(0 \lt \alpha \lt 1\)), the above inequality can be written as

\[{\left( {1 + x} \right)^\alpha } \le \alpha x + 1.\]

This relationship is used to prove other classical inequalities.

In the second case (when \(\alpha \lt 0\) or \(\alpha \gt 1\)), the inequality can be expressed in the form

\[{\left( {1 + x} \right)^\alpha } \ge \alpha x + 1.\]

In the particular case when \(\alpha = n\) is a natural number (for \(n = 1\) the two sides coincide), we obtain the well-known Bernoulli’s inequality:

\[{{\left( {1 + x} \right)^n} \ge 1 + nx,\;\;\;\text{where}}\;\;\;\kern-0.3pt{x \ge - 1,\;n \in \mathbb{N}.}\]

Fig.2 Daniel Bernoulli (1700-1782), Swiss mathematician
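
As a quick numerical illustration (the values of \(x\) and \(\alpha\) are chosen here only as an example), take \(x = 3.\) For \(\alpha = {\large\frac{1}{2}\normalsize}\) the first inequality gives

\[{\left( {1 + 3} \right)^{\large\frac{1}{2}\normalsize}} = 2 \le 1 + \frac{1}{2} \cdot 3 = \frac{5}{2},\]

while for \(\alpha = 2\) and \(\alpha = -1\) the second inequality gives

\[{\left( {1 + 3} \right)^2} = 16 \ge 1 + 2 \cdot 3 = 7,\;\;\;{\left( {1 + 3} \right)^{ - 1}} = \frac{1}{4} \ge 1 - 3 = - 2.\]

Similarly, Bernoulli’s inequality with \(n = 3,\) \(x = -{\large\frac{1}{2}\normalsize}\) reads \({\large\frac{1}{8}\normalsize} \ge -{\large\frac{1}{2}\normalsize}.\)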

Young’s Inequality

We return to the inequality obtained above,

\[{\left( {1 + x} \right)^\alpha } \le \alpha x + 1,\]

which is valid for \(x \ge -1,\) \(0 \lt \alpha \lt 1.\) We introduce the following notation:

\[
{1 + x = \frac{a}{b},}\;\;\;\kern-0.3pt
{\alpha = \frac{1}{p},}\;\;\;\kern-0.3pt
{\frac{1}{q} = 1 - \frac{1}{p}.}
\]

This assumes that \(a \ge 0,\) \(b \gt 0.\) From the condition \(0 \lt \alpha \lt 1\) it also follows that \(p \gt 1.\) Substituting this in our inequality, we have:

\[
{{\left( {\frac{a}{b}} \right)^{\large\frac{1}{p}\normalsize}} \le 1 + \frac{1}{p}\left( {\frac{a}{b} - 1} \right),\;\;}\Rightarrow
{\frac{{{a^{\large\frac{1}{p}\normalsize}}}}{{{b^{\large\frac{1}{p}\normalsize}}}} \le 1 - \frac{1}{p} + \frac{1}{p}\frac{a}{b},\;\;}\Rightarrow
{\frac{{{a^{\large\frac{1}{p}\normalsize}}}}{{{b^{\large\frac{1}{p}\normalsize}}}} \le \frac{1}{q} + \frac{1}{p}\frac{a}{b}.}
\]

Multiply both sides by \(b\) (\(b \gt 0\)). Consequently,

\[
{{a^{\large\frac{1}{p}\normalsize}}{b^{1 - \large\frac{1}{p}\normalsize}} \le \frac{a}{p} + \frac{b}{q},\;\;}\Rightarrow
{{a^{\large\frac{1}{p}\normalsize}}{b^{\large\frac{1}{q}\normalsize}} \le \frac{a}{p} + \frac{b}{q}.}
\]

This is Young’s inequality.

Fig.3 William Henry Young (1863-1942), English mathematician

Renaming \({a^{\large\frac{1}{p}\normalsize}} \to a,\) \({b^{\large\frac{1}{q}\normalsize}} \to b,\) we can write Young’s inequality in the following form:

\[{ab \le \frac{{{a^p}}}{p} + \frac{{{b^q}}}{q}}\;\;\;\kern-0.3pt{\left( {p \gt 1} \right).}\]

Note that for \(p \lt 1\) (\(p \ne 0\)) and positive \(a,\) \(b,\) Young’s inequality holds with the opposite sign:

\[{ab \ge \frac{{{a^p}}}{p} + \frac{{{b^q}}}{q}}\;\;\;\kern-0.3pt{\left( {p \lt 1,\,p \ne 0} \right).}\]
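
For instance (a check with values chosen only for illustration), let \(p = 3,\) so that \(q = {\large\frac{3}{2}\normalsize}.\) For \(a = 1,\) \(b = 4\) we have

\[ab = 4 \le \frac{{{1^3}}}{3} + \frac{{{4^{3/2}}}}{{3/2}} = \frac{1}{3} + \frac{{16}}{3} = \frac{{17}}{3},\]

and for \(a = 2,\) \(b = 4,\) where \({a^p} = {b^q} = 8,\) the two sides coincide:

\[ab = 8 = \frac{8}{3} + \frac{{16}}{3}.\]

In the reversed case, taking \(p = {\large\frac{1}{2}\normalsize}\) gives \(q = -1,\) and for \(a = 4,\) \(b = 1\) we get

\[ab = 4 \ge \frac{{{4^{1/2}}}}{{1/2}} + \frac{{{1^{ - 1}}}}{{ - 1}} = 4 - 1 = 3.\]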

Arithmetic-Geometric Mean Inequality

The inequality of arithmetic and geometric means (AM-GM) for two non-negative numbers follows from Young’s inequality at \(p = q = 2\) (so that \({\large\frac{1}{p}\normalsize} = {\large\frac{1}{q}\normalsize} = {\large\frac{1}{2}\normalsize}\)):

\[
{{a^{\large\frac{1}{2}\normalsize}}{b^{\large\frac{1}{2}\normalsize}} \le \frac{a}{2} + \frac{b}{2}}\;\;\kern-0.3pt
{\text{or}\;\;\sqrt {ab} \le \frac{{a + b}}{2}.}
\]

In fact, Young’s inequality can be generalized to the case of \(n\) numbers. Then it takes the following form:

\[
{a_1^{\large\frac{1}{{{p_1}}}\normalsize}a_2^{\large\frac{1}{{{p_2}}}\normalsize} \ldots a_n^{\large\frac{1}{{{p_n}}}\normalsize} }
{\le \frac{{{a_1}}}{{{p_1}}} + \frac{{{a_2}}}{{{p_2}}} + \ldots + \frac{{{a_n}}}{{{p_n}}},}
\]

where

\[
{{a_1},{a_2}, \ldots ,{a_n},{p_1},{p_2}, \ldots ,{p_n} \gt 0,}\;\;\;\kern-0.3pt
{\frac{1}{{{p_1}}} + \frac{1}{{{p_2}}} + \ldots + \frac{1}{{{p_n}}} = 1.}
\]

Letting \({p_1} = {p_2} = \ldots = {p_n} = n,\) we get the following relationship:

\[\sqrt[\large n\normalsize]{{{a_1}{a_2} \cdots {a_n}}} \le \frac{{{a_1} + {a_2} + \ldots + {a_n}}}{n},\]

which means that the geometric mean of \(n\) non-negative numbers is not greater than their arithmetic mean.
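
As a small check (the numbers are chosen only for illustration), for \({a_1} = 1,\) \({a_2} = 2,\) \({a_3} = 4\) we get

\[\sqrt[\large 3\normalsize]{{1 \cdot 2 \cdot 4}} = 2 \le \frac{{1 + 2 + 4}}{3} = \frac{7}{3},\]

whereas for three equal numbers, say \({a_1} = {a_2} = {a_3} = 2,\) both means are equal to \(2.\)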

Hölder’s Inequality

Consider \(n\) pairs of positive numbers \({a_i},{b_i}\) \(\left( {i = 1, \ldots ,n} \right).\) If the numbers \(p\) and \(q\) satisfy the condition \({\large\frac{1}{p}\normalsize} + {\large\frac{1}{q}\normalsize} = 1,\) then Hölder’s inequality holds, which for \(p \gt 1\) has the form

\[
{\sum\limits_{i = 1}^n {{a_i}{b_i}} }
{\le {\left( {\sum\limits_{i = 1}^n {a_i^p} } \right)^{\large\frac{1}{p}\normalsize}}{\left( {\sum\limits_{i = 1}^n {b_i^q} } \right)^{\large\frac{1}{q}\normalsize}}.}
\]

We prove this relationship. Denote

\[
{A = \sum\limits_{i = 1}^n {a_i^p} ,}\;\;\;\kern-0.3pt
{B = \sum\limits_{i = 1}^n {b_i^q}.}
\]

Then Hölder’s inequality is written as follows:

\[\sum\limits_{i = 1}^n {{a_i}{b_i}} \le {A^{\large\frac{1}{p}\normalsize}}{B^{\large\frac{1}{q}\normalsize}}.\]

Next, we use Young’s inequality in the form

\[{a^{\large\frac{1}{p}\normalsize}}{b^{\large\frac{1}{q}\normalsize}} \le \frac{a}{p} + \frac{b}{q}.\]
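
Fig.4 Otto Hölder (1859-1937), German mathematician

For each \(i,\) let

\[{a = \frac{{a_i^p}}{A},}\;\;\;\kern-0.3pt{b = \frac{{b_i^q}}{B},}\]

so that \({a^{\large\frac{1}{p}\normalsize}}{b^{\large\frac{1}{q}\normalsize}} = {\large\frac{{{a_i}{b_i}}}{{{A^{1/p}}{B^{1/q}}}}\normalsize}.\) Applying Young’s inequality for each \(i\) and summing over \(i,\) we obtain: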

\[\require{cancel}
{\sum\limits_{i = 1}^n {\frac{{{a_i}{b_i}}}{{{A^{\large\frac{1}{p}\normalsize}}{B^{\large\frac{1}{q}\normalsize}}}}} \le \sum\limits_{i = 1}^n {\left( {\frac{{a_i^p}}{{pA}} + \frac{{b_i^q}}{{qB}}} \right)} ,\;\;}\Rightarrow
{\frac{{\sum\limits_{i = 1}^n {{a_i}{b_i}} }}{{{A^{\large\frac{1}{p}\normalsize}}{B^{\large\frac{1}{q}\normalsize}}}} \le \frac{{\sum\limits_{i = 1}^n {a_i^p} }}{{pA}} + \frac{{\sum\limits_{i = 1}^n {b_i^q} }}{{qB}},\;\;}\Rightarrow
{\frac{{\sum\limits_{i = 1}^n {{a_i}{b_i}} }}{{{A^{\large\frac{1}{p}\normalsize}}{B^{\large\frac{1}{q}\normalsize}}}} \le \frac{\cancel{A}}{{p\cancel{A}}} + \frac{\cancel{B}}{{q\cancel{B}}} = \frac{1}{p} + \frac{1}{q} = 1,\;\;}\Rightarrow
{\sum\limits_{i = 1}^n {{a_i}{b_i}} \le {A^{\large\frac{1}{p}\normalsize}}{B^{\large\frac{1}{q}\normalsize}},\;\;}\Rightarrow
{\sum\limits_{i = 1}^n {{a_i}{b_i}} \le {\left( {\sum\limits_{i = 1}^n {a_i^p} } \right)^{\large\frac{1}{p}\normalsize}}{\left( {\sum\limits_{i = 1}^n {b_i^q} } \right)^{\large\frac{1}{q}\normalsize}}\;\;\left( {p > 1} \right).}
\]

Thus, Hölder’s inequality is proved for the case \(p \gt 1.\) For \(p \lt 1\) (\(p \ne 0\)), this inequality is written with the opposite sign:

\[
{\sum\limits_{i = 1}^n {{a_i}{b_i}} \ge {\left( {\sum\limits_{i = 1}^n {a_i^p} } \right)^{\large\frac{1}{p}\normalsize}}{\left( {\sum\limits_{i = 1}^n {b_i^q} } \right)^{\large\frac{1}{q}\normalsize}}}\;\;\;\kern-0.3pt
{\left( {p \lt 1,\,p \ne 0} \right).}
\]
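
To illustrate the case \(p \gt 1\) (with values chosen only as an example), take \(p = 3,\) \(q = {\large\frac{3}{2}\normalsize},\) \(n = 2,\) \(\left( {{a_1},{a_2}} \right) = \left( {1,2} \right),\) \(\left( {{b_1},{b_2}} \right) = \left( {4,1} \right).\) Then

\[\sum\limits_{i = 1}^2 {{a_i}{b_i}} = 1 \cdot 4 + 2 \cdot 1 = 6,\;\;\;\sum\limits_{i = 1}^2 {a_i^3} = 1 + 8 = 9,\;\;\;\sum\limits_{i = 1}^2 {b_i^{3/2}} = 8 + 1 = 9,\]

so the inequality reads

\[6 \le {9^{\large\frac{1}{3}\normalsize}} \cdot {9^{\large\frac{2}{3}\normalsize}} = 9.\]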

Fig.5 Augustin-Louis Cauchy (1789-1857), French mathematician

Cauchy–Schwarz Inequality

There is one more well-known inequality, the Cauchy–Schwarz inequality, which can be considered a special case of Hölder’s inequality with \(p = q = 2.\) It is written in the form

\[\sum\limits_{i = 1}^n {{a_i}{b_i}} \le \sqrt {\sum\limits_{i = 1}^n {a_i^2} } \sqrt {\sum\limits_{i = 1}^n {b_i^2} } .\]
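
As a quick check (the numbers are chosen only for illustration), for \(\left( {{a_1},{a_2}} \right) = \left( {1,2} \right)\) and \(\left( {{b_1},{b_2}} \right) = \left( {3,4} \right)\) we have

\[1 \cdot 3 + 2 \cdot 4 = 11 \le \sqrt {1 + 4} \,\sqrt {9 + 16} = 5\sqrt 5 \approx 11.18.\]

For proportional pairs, such as \(\left( {1,2} \right)\) and \(\left( {2,4} \right),\) the two sides coincide: \(2 + 8 = 10 = \sqrt 5 \,\sqrt {20}.\)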

Minkowski’s Inequality

Minkowski’s inequality states that for positive numbers \({a_i}\) and \({b_i}\) the following relation is valid:

\[
{{\left( {\sum\limits_{i = 1}^n {{{\left( {{a_i} + {b_i}} \right)}^p}} } \right)^{\large\frac{1}{p}\normalsize}} }\kern0pt
{\le {\left( {\sum\limits_{i = 1}^n {a_i^p} } \right)^{\large\frac{1}{p}\normalsize}} }\kern0pt
{+ {\left( {\sum\limits_{i = 1}^n {b_i^p} } \right)^{\large\frac{1}{p}\normalsize}},}
\]

where \(p \gt 1.\)

This inequality can also be derived from Hölder’s inequality discussed above. We represent the sum on the left-hand side of Minkowski’s inequality as follows:

\[
{\sum\limits_{i = 1}^n {{{\left( {{a_i} + {b_i}} \right)}^p}} }
= {\sum\limits_{i = 1}^n {\left( {{a_i} + {b_i}} \right){{\left( {{a_i} + {b_i}} \right)}^{p - 1}}} }
= {\sum\limits_{i = 1}^n {{a_i}{{\left( {{a_i} + {b_i}} \right)}^{p - 1}}} }
+ {\sum\limits_{i = 1}^n {{b_i}{{\left( {{a_i} + {b_i}} \right)}^{p - 1}}} .}
\]

Apply Hölder’s inequality to each of the sums. Then

\[
{\sum\limits_{i = 1}^n {{{\left( {{a_i} + {b_i}} \right)}^p}} = \sum\limits_{i = 1}^n {{a_i}{{\left( {{a_i} + {b_i}} \right)}^{p - 1}}} }\kern0pt
{+ \sum\limits_{i = 1}^n {{b_i}{{\left( {{a_i} + {b_i}} \right)}^{p - 1}}} }\kern0pt
{\le {\left( {\sum\limits_{i = 1}^n {a_i^p} } \right)^{\large\frac{1}{p}\normalsize}}{\left( {\sum\limits_{i = 1}^n {{{\left( {{a_i} + {b_i}} \right)}^{\left( {p - 1} \right)q}}} } \right)^{\large\frac{1}{q}\normalsize}} }\kern0pt
{+ {\left( {\sum\limits_{i = 1}^n {b_i^p} } \right)^{\large\frac{1}{p}\normalsize}}{\left( {\sum\limits_{i = 1}^n {{{\left( {{a_i} + {b_i}} \right)}^{\left( {p - 1} \right)q}}} } \right)^{\large\frac{1}{q}\normalsize}} }\kern0pt
= {{\left[ {{{\left( {\sum\limits_{i = 1}^n {a_i^p} } \right)}^{\large\frac{1}{p}\normalsize}} + {{\left( {\sum\limits_{i = 1}^n {b_i^p} } \right)}^{\large\frac{1}{p}\normalsize}}} \right]\cdot}\kern0pt{{{\left( {\sum\limits_{i = 1}^n {{{\left( {{a_i} + {b_i}} \right)}^{\left( {p - 1} \right)q}}} } \right)^{\large\frac{1}{q}\normalsize}}}.}}
\]

We must take into account that

\[
{\frac{1}{p} + \frac{1}{q} = 1,\;\;}\Rightarrow
{\frac{1}{q} = 1 - \frac{1}{p} = \frac{{p - 1}}{p},\;\;}\Rightarrow
{q = \frac{p}{{p - 1}}.}
\]

Therefore \(\left( {p - 1} \right)q = p,\) and the previous expression can be represented as follows:

\[
{\sum\limits_{i = 1}^n {{{\left( {{a_i} + {b_i}} \right)}^p}} }\kern0pt
{\le {\left[ {{{\left( {\sum\limits_{i = 1}^n {a_i^p} } \right)}^{\large\frac{1}{p}\normalsize}} + {{\left( {\sum\limits_{i = 1}^n {b_i^p} } \right)}^{\large\frac{1}{p}\normalsize}}} \right]\cdot}\kern0pt{{\left( {\sum\limits_{i = 1}^n {{{\left( {{a_i} + {b_i}} \right)}^p}} } \right)^{\large\frac{1}{q}\normalsize}},}\;\;}\Rightarrow
{\frac{{\sum\limits_{i = 1}^n {{{\left( {{a_i} + {b_i}} \right)}^p}} }}{{{{\left( {\sum\limits_{i = 1}^n {{{\left( {{a_i} + {b_i}} \right)}^p}} } \right)}^{\large\frac{1}{q}\normalsize}}}} }\kern0pt
{\le {\left( {\sum\limits_{i = 1}^n {a_i^p} } \right)^{\large\frac{1}{p}\normalsize}} + {\left( {\sum\limits_{i = 1}^n {b_i^p} } \right)^{\large\frac{1}{p}\normalsize}}.}
\]

Since \(1 - {\large\frac{1}{q}\normalsize} = {\large\frac{1}{p}\normalsize},\) as a result we obtain Minkowski’s inequality:

\[
{{\left( {\sum\limits_{i = 1}^n {{{\left( {{a_i} + {b_i}} \right)}^p}} } \right)^{\large\frac{1}{p}\normalsize}} }\kern0pt
{\le {\left( {\sum\limits_{i = 1}^n {a_i^p} } \right)^{\large\frac{1}{p}\normalsize}} }\kern0pt
{+ {\left( {\sum\limits_{i = 1}^n {b_i^p} } \right)^{\large\frac{1}{p}\normalsize}}.}
\]

Accordingly, in the case \(p \lt 1\) (\(p \ne 0\)), Minkowski’s inequality is written with the opposite sign.
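
As a numerical illustration for \(p \gt 1\) (the values are chosen only as an example), take \(p = 3,\) \(n = 2,\) \(\left( {{a_1},{a_2}} \right) = \left( {1,2} \right),\) \(\left( {{b_1},{b_2}} \right) = \left( {2,1} \right).\) Then

\[{\left( {{3^3} + {3^3}} \right)^{\large\frac{1}{3}\normalsize}} = {54^{\large\frac{1}{3}\normalsize}} \approx 3.78 \le {\left( {1 + 8} \right)^{\large\frac{1}{3}\normalsize}} + {\left( {8 + 1} \right)^{\large\frac{1}{3}\normalsize}} = 2 \cdot {9^{\large\frac{1}{3}\normalsize}} \approx 4.16.\]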

Fig.6 Hermann Minkowski (1864-1909), German mathematician

Triangle Inequality

The triangle inequality in the plane follows from Minkowski’s inequality at \(n = 2,\) \(p = 2.\)

Figure 7. The triangle inequality.

Consider a triangle \(ABC\) in the \(xy\)-plane with vertices \(A\left( {{x_A},{y_A}} \right),\) \(B\left( {{x_B},{y_B}} \right),\) and \(C\left( {{x_C},{y_C}} \right)\) (Figure \(7\)). Substituting \(n = 2,\) \(p = 2\) into Minkowski’s inequality, we get:

\[
{\sqrt {\sum\limits_{i = 1}^2 {{{\left( {{a_i} + {b_i}} \right)}^2}} } }\kern0pt
{\le \sqrt {\sum\limits_{i = 1}^2 {a_i^2} } + \sqrt {\sum\limits_{i = 1}^2 {b_i^2} } ,\;\;}\Rightarrow
{\sqrt {{{\left( {{a_1} + {b_1}} \right)}^2} + {{\left( {{a_2} + {b_2}} \right)}^2}} }\kern0pt
{\le \sqrt {a_1^2 + a_2^2} + \sqrt {b_1^2 + b_2^2} .}
\]

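Let the numbers \({a_1},{a_2},{b_1},{b_2}\) be expressed through the coordinates of the vertices as follows (these differences may be negative, but for \(p = 2\) the inequality above remains valid for arbitrary real numbers, because \({\left( {{a_i} + {b_i}} \right)^2} \le {\left( {\left| {{a_i}} \right| + \left| {{b_i}} \right|} \right)^2}\)):

\[
{{a_1} = {x_A} - {x_B},}\;\;\;\kern-0.3pt
{{a_2} = {y_A} - {y_B},}\;\;\;\kern-0.3pt
{{b_1} = {x_B} - {x_C},}\;\;\;\kern-0.3pt
{{b_2} = {y_B} - {y_C}.}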
\]

Therefore, we can write:

\[
{\sqrt {{{\left( {{x_A} - \cancel{x_B} + \cancel{x_B} - {x_C}} \right)}^2} + {{\left( {{y_A} - \cancel{y_B} + \cancel{y_B} - {y_C}} \right)}^2}} }\kern0pt
{\le \sqrt {{{\left( {{x_A} - {x_B}} \right)}^2} + {{\left( {{y_A} - {y_B}} \right)}^2}} }\kern0pt
{+ \sqrt {{{\left( {{x_B} - {x_C}} \right)}^2} + {{\left( {{y_B} - {y_C}} \right)}^2}} }
\]

or

\[
{\sqrt {{{\left( {{x_A} - {x_C}} \right)}^2} + {{\left( {{y_A} - {y_C}} \right)}^2}} }\kern0pt
{\le \sqrt {{{\left( {{x_A} - {x_B}} \right)}^2} + {{\left( {{y_A} - {y_B}} \right)}^2}} }\kern0pt
{+ \sqrt {{{\left( {{x_B} - {x_C}} \right)}^2} + {{\left( {{y_B} - {y_C}} \right)}^2}} .}
\]

This inequality is called the triangle inequality and describes the relationship between the lengths of the sides of the triangle:

\[\left| {AC} \right| \le \left| {AB} \right| + \left| {BC} \right|.\]

It means that the length of any side of a triangle does not exceed the sum of the lengths of the other two sides. Equality is possible only when the three points lie on one line (with \(B\) between \(A\) and \(C\)).
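
For example (the coordinates are chosen only for illustration), for \(A\left( {0,0} \right),\) \(B\left( {3,4} \right),\) \(C\left( {6,0} \right)\) we get

\[\left| {AC} \right| = 6 \le \left| {AB} \right| + \left| {BC} \right| = \sqrt {{3^2} + {4^2}} + \sqrt {{3^2} + {4^2}} = 10,\]

while moving \(B\) to the point \(\left( {3,0} \right)\) on the segment \(AC\) gives equality: \(6 = 3 + 3.\)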

Similarly, the triangle inequality can be obtained from Minkowski’s inequality in three-dimensional Euclidean space. This case occurs when \(n = 3,\) \(p = 2.\)