High-dimensional space
The Law of Large Numbers
Generating random points in \(d\)-dimensional space using a Gaussian will cause the distance between all pairs of points to be nearly the same when \(d\) is large, because the squared distance
$$|\boldsymbol{y-z}|^2 = \sum_{i=1}^d(y_i-z_i)^2$$
can be viewed as the sum of \(d\) independent samples of a random variable \(x\) that is the squared difference of two Gaussian random variables
$$x_i = (y_i - z_i)^2$$
The Law of Large Numbers states that with high probability the average of the samples will be close to the expectation of the random variable:
$$\Pr\left( \left|\frac{x_1+x_2+\cdots+x_n}{n} - E(x)\right| \ge\epsilon \right) \le \frac{Var(x)}{n\epsilon^2}$$
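As a quick empirical check (a minimal NumPy sketch; the dimension \(d\), the number of trials, and the tolerance \(\epsilon\) below are arbitrary illustrative choices), we can sample the coordinate differences of two Gaussian points, average their squares, and compare the observed deviation probability against the bound \(Var(x)/(n\epsilon^2)\):

```python
import numpy as np

rng = np.random.default_rng(0)
d, trials, eps = 1000, 10_000, 0.2      # arbitrary illustrative choices

# x_i = (y_i - z_i)^2 where y_i, z_i are independent standard normals,
# so y_i - z_i ~ N(0, 2), E(x) = 2 and Var(x) = 8.
diffs = rng.standard_normal((trials, d)) - rng.standard_normal((trials, d))
means = (diffs ** 2).mean(axis=1)       # one average of d samples per trial

empirical = np.mean(np.abs(means - 2.0) >= eps)
bound = 8.0 / (d * eps ** 2)            # Var(x) / (n * eps^2) with n = d
print(f"empirical Pr(|mean - 2| >= {eps}) = {empirical:.4f}")
print(f"Chebyshev-style bound            = {bound:.4f}")
```

The empirical frequency comes out well below the bound, which is expected: the bound holds for any distribution with the given variance and is therefore loose for Gaussians.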
Compare it with the confidence limits for \(\mu\) when \(\sigma\) is known in statistics:
$$\begin{align} \Pr(|\overline X - \mu| \le \epsilon) &= \Pr(\frac{|\overline X - \mu|}{\sigma/\sqrt{n}} \le \frac{\epsilon}{\sigma/\sqrt{n}})\\[2ex] &=2\int_0^{\frac{\epsilon}{\sigma/\sqrt{n}}}\frac{e^{-Z^2/2}}{\sqrt{2\pi}}dZ \end{align}$$
where \(\mu\) is the population mean, \(\sigma\) is the population standard deviation, \(\overline X\) is the sample mean, \(n\) is the sample size (provided it is large enough that \(\overline X\) is approximately normally distributed), \(\epsilon\) is the error, and \(Z\) is the standard normal variate.
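For a concrete comparison (a small sketch using only the standard library; \(n\), \(\sigma\), and \(\epsilon\) are arbitrary illustrative values), the normal-approximation probability can be evaluated with the error function and set against the Chebyshev-style tail bound above:

```python
import math

n, sigma, eps = 100, 1.0, 0.25          # arbitrary illustrative values

# Normal approximation: Pr(|Xbar - mu| <= eps) = erf(eps * sqrt(n) / (sigma * sqrt(2)))
z = eps / (sigma / math.sqrt(n))
normal_prob = math.erf(z / math.sqrt(2))

# Law of Large Numbers bound: Pr(|Xbar - mu| >= eps) <= sigma^2 / (n * eps^2)
chebyshev_tail = sigma ** 2 / (n * eps ** 2)

print(f"normal approximation: Pr(deviation <= eps) ~ {normal_prob:.4f}")
print(f"Chebyshev-style bound: Pr(deviation >= eps) <= {chebyshev_tail:.4f}")
```

The normal approximation puts the tail probability near one percent here, while the distribution-free bound only guarantees sixteen percent; the weaker bound is the price of making no assumption about the distribution.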
The Law of Large Numbers can be proved by using Chebyshev's inequality and a few facts about expectations and variances (the last two requiring independence):
$$\begin{align} &E(x+y) = E(x) + E(y)\\ &Var(x-c) = Var(x)\\ &Var(cx) = c^2 Var(x)\\ &E(xy) = E(x) E(y)\\ &Var(x+y) = Var(x) + Var(y) \end{align}$$
Markov's inequality. Let \(x\) be a nonnegative random variable. Then for \(a \gt 0\),
$$\Pr(x \ge a) \le \frac{E(x)}{a}$$
Proof For a continuous nonnegative random variable \(x\) with probability density \(p\),
$$\begin{align} E(x) &= \int_0^\infty xp(x)dx\\[2ex] &= \int_0^a xp(x)dx + \int_a^\infty xp(x)dx\\[2ex] &\ge \int_a^\infty xp(x)dx\\[2ex] &\ge a\int_a^\infty p(x)dx\\[2ex] &= a \Pr(x \ge a) &\blacksquare \end{align}$$
The proof for discrete random variables works the same with sums instead of integrals.
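A minimal Monte Carlo sanity check of Markov's inequality (the exponential distribution and the thresholds are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.exponential(scale=1.0, size=1_000_000)   # nonnegative, E(x) = 1

for a in (1.0, 2.0, 5.0):
    empirical = np.mean(x >= a)
    print(f"a={a}: Pr(x >= a) = {empirical:.4f}  <=  E(x)/a = {1.0 / a:.4f}")
```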
Chebyshev's inequality. Let \(x\) be a random variable with finite variance. Then for \(c \gt 0\),
$$\Pr(|x-E(x)| \ge c) \le \frac{Var(x)}{c^2}$$
Proof \(\Pr(|x-E(x)| \ge c) = \Pr(|x-E(x)|^2 \ge c^2)\). Note that \(y=|x-E(x)|^2\) is a nonnegative random variable and \(E(y)=Var(x)\). Applying Markov's inequality,
$$\begin{align} \Pr(|x-E(x)| \ge c) &= \Pr(|x-E(x)|^2 \ge c^2)\\[2ex] &\le \frac{E(|x-E(x)|^2)}{c^2}\\[2ex] &= \frac{Var(x)}{c^2} &\blacksquare \end{align}$$
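A similar check of Chebyshev's inequality, here with a uniform random variable (again arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(0.0, 1.0, size=1_000_000)        # E(x) = 1/2, Var(x) = 1/12

for c in (0.25, 0.4, 0.45):
    empirical = np.mean(np.abs(x - 0.5) >= c)
    print(f"c={c}: Pr(|x - E(x)| >= c) = {empirical:.4f}  <=  Var(x)/c^2 = {(1 / 12) / c ** 2:.4f}")
```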
If we draw two random points \(\boldsymbol{y}\) and \(\boldsymbol{z}\) from a \(d\)-dimensional Gaussian with unit variance in each direction, then \(|\boldsymbol{y}|^2 \approx d\) and \(|\boldsymbol{z}|^2 \approx d\). Since for all \(i\),
$$\begin{align} E\left((y_i-z_i)^2\right) &= E(y_i^2) + E(z_i^2) - 2E(y_i z_i)\\[2ex] &= Var(y_i) + Var(z_i) - 2E(y_i)E(z_i)\\[2ex] &= 2 \end{align}$$
then
$$|\boldsymbol{y-z}|^2 = \sum_{i=1}^d (y_i-z_i)^2 \approx 2d$$
Since \(|\boldsymbol{y-z}|^2 \approx 2d \approx |\boldsymbol{y}|^2 + |\boldsymbol{z}|^2\), by the Pythagorean theorem the random \(d\)-dimensional vectors \(\boldsymbol{y}\) and \(\boldsymbol{z}\) must be approximately orthogonal.
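Both claims are easy to check numerically (a NumPy sketch; \(d\) and the number of points are arbitrary): squared norms should concentrate near \(d\), pairwise squared distances near \(2d\), and the angle between two random points near \(90^\circ\).

```python
import numpy as np

rng = np.random.default_rng(3)
d, n_points = 1000, 200                          # arbitrary illustrative choices
pts = rng.standard_normal((n_points, d))         # points from a spherical Gaussian

sq_norms = (pts ** 2).sum(axis=1)                # expect approximately d
y, z = pts[::2], pts[1::2]                       # pair the points up
sq_dists = ((y - z) ** 2).sum(axis=1)            # expect approximately 2d

cosines = (y * z).sum(axis=1) / np.sqrt((y ** 2).sum(axis=1) * (z ** 2).sum(axis=1))
angles = np.degrees(np.arccos(cosines))          # expect approximately 90 degrees

print(f"mean |y|^2     = {sq_norms.mean():.1f}  (d  = {d})")
print(f"mean |y - z|^2 = {sq_dists.mean():.1f}  (2d = {2 * d})")
print(f"angles: mean {angles.mean():.1f} deg, std {angles.std():.1f} deg")
```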
Master Tail Bounds Theorem. Let \(x = x_1 + x_2 + \cdots + x_n,\) where \(x_1, x_2, \ldots, x_n\) are mutually independent random variables with zero mean and variance at most \(\sigma^2\). Let \(0 \le a \le \sqrt{2}n\sigma^2\). Assume that \(|E(x_i^s)| \le \sigma^2s!\) for \(s=3,4,\ldots, \lfloor a^2/(4n\sigma^2)\rfloor\). Then,
$$\Pr(|x| \ge a) \le 3e^{-a^2/(12n\sigma^2)}$$
The proof works by applying Markov's inequality to the random variable \(x^r\), where \(r\) is a large even number. Since \(r\) is even, \(x^r\) is nonnegative and \(|x| \ge a\) exactly when \(x^r \ge a^r\), so
$$\Pr(|x| \ge a) = \Pr(x^r \ge a^r) \le \frac{E(x^r)}{a^r}$$
Then write \(E(x^r)\) as \(E\left((x_1 + \cdots + x_n)^r\right)\) and expand the polynomial by applying the multinomial theorem. Use the fact that by independence \(E(x_i^{r_i} x_j^{r_j}) = E(x_i^{r_i})E(x_j^{r_j})\) to get a collection of simpler expectations that can be bounded using the assumption that \(|E(x_i^s)| \le \sigma^2 s!\).
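As a numerical illustration (a sketch, not part of the proof): if the \(x_i\) are independent standard normals, then \(|E(x_i^s)| \le \sigma^2 s!\) holds and \(x\) is exactly \(N(0, n)\), so the theorem's bound can be compared with the exact Gaussian tail. The bound is loose here, but it applies to any distribution satisfying the hypotheses.

```python
import math

n, sigma = 100, 1.0            # x = x_1 + ... + x_n is N(0, n) for standard normal x_i

for a in (20.0, 40.0, 60.0):   # all satisfy a <= sqrt(2) * n * sigma^2
    exact = math.erfc(a / math.sqrt(2 * n))              # Pr(|x| >= a) for N(0, n)
    bound = 3 * math.exp(-a ** 2 / (12 * n * sigma ** 2))
    print(f"a={a:>4}: exact tail {exact:.2e}   theorem bound {bound:.2e}")
```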
The Geometry of High Dimensions
Most of the volume of high-dimensional objects is concentrated near the surface. Consider any object \(A\) in \(R^d\). Shrink \(A\) by a small amount \(\epsilon\) to produce a new object \((1-\epsilon)A=\{(1-\epsilon)x|x\in A\}\). Then,
$$\text{volume}\left((1-\epsilon)A\right)=(1-\epsilon)^d\text{volume}(A)$$
A useful inequality of the exponential function: for all real \(x\),
$$1+x \le e^{x}$$
Proof If \(x \le -1\), then \(1+x \le 0 \lt e^x\), so the inequality holds trivially. Suppose \(x \ge -1\), so that \(2k+1+x \ge 2k \ge 0\) for every \(k \ge 1\) and each summand in the grouped series below is nonnegative. Then,
$$\begin{align} e^x &= \sum_{k=0}^\infty \frac{x^k}{k!}\\[2ex] &= 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots\\[2ex] &= 1 + x + \sum_{k \ge 1} \left( \frac{x^{2k}}{(2k)!} + \frac{x^{2k+1}}{(2k+1)!} \right)\\[2ex] &= 1 + x + \sum_{k \geq 1} x^{2k}\left( \frac{1}{(2k)!} + \frac{x}{(2k+1)!} \right)\\[2ex] &= 1 + x + \sum_{k \geq 1} x^{2k}\left( \frac{2k + 1 + x}{(2k+1)!} \right)\\[2ex] &\ge 1 + x &\blacksquare \end{align}$$
When we shrink each of the \(2d\) sides of a \(d\)-dimensional cube by a factor \(f\), its volume shrinks by a factor of \(f^d\). For any object \(A\) in \(R^d\),
$$\frac{\text{volume}\left((1-\epsilon)A\right)}{\text{volume}(A)} = (1-\epsilon)^d \le e^{-\epsilon d}$$
Fixing \(\epsilon\) and letting \(d \to \infty\), the above quantity approaches zero. This means that almost all of the volume of \(A\) is concentrated in the portion of \(A\) that does not belong to the region \((1-\epsilon)A\).
Let \(S\) denote the unit ball in \(d\) dimensions. At least a \(1-e^{-\epsilon d}\) fraction of the volume of the unit ball is concentrated in a small annulus \(S \setminus (1-\epsilon)S\) of width \(\epsilon\) at the boundary. If the ball has radius \(r\), most of its volume lies in an annulus of width \(O(\frac{r}{d})\) at the boundary.
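A quick numerical check (a sketch; the dimensions and the choice \(\epsilon = 1/d\) are for illustration): for a uniform point in the unit ball the radius has CDF \(r^d\), so we can sample radii as \(U^{1/d}\) and measure the fraction landing in the annulus, comparing it with the exact value \(1-(1-\epsilon)^d\) and the lower bound \(1-e^{-\epsilon d}\).

```python
import numpy as np

rng = np.random.default_rng(4)
n_points = 100_000

for d in (10, 50, 200):
    eps = 1.0 / d                                        # annulus of width 1/d
    # For a uniform point in the unit ball, Pr(radius <= r) = r^d,
    # so the radius can be sampled as U^(1/d) with U uniform on [0, 1].
    radii = rng.uniform(0.0, 1.0, size=n_points) ** (1.0 / d)
    in_annulus = np.mean(radii > 1.0 - eps)
    exact = 1.0 - (1.0 - eps) ** d
    lower_bound = 1.0 - np.exp(-eps * d)
    print(f"d={d:>3}: empirical {in_annulus:.3f}  exact {exact:.3f}  lower bound {lower_bound:.3f}")
```

With \(\epsilon = 1/d\) the fraction in the annulus stays near \(1 - e^{-1} \approx 0.63\) for every \(d\), which is the sense in which an annulus of width \(O(r/d)\) already captures a constant fraction of the volume.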