Chebychev's bound for prime counting functions
We start by proving Chebychev's theorem, which gives surprisingly good bounds given the simplicity of its proof. In fact, no analysis is needed in the proof, and the idea of the proof is easy to grasp. Even if one is satisfied with as weak a bound as $\pi(x)>x^{\varepsilon}$ for large $x$ and some $\varepsilon>0$, the easiest way to obtain it is to prove Chebychev's much stronger statement. Another good feature of the proof is that it gives explicit inequalities (the lower bound in the following theorem is not explicit, but that is just because we do not need explicit bounds now). As strong as the prime number theorem is, it is ineffective, so it cannot prove that every interval $[n,2n]$ contains a prime number, only that the number of exceptional $n$ is finite.
Let $\theta(x)=\sum_{p\leq x}\log p$ be the Chebychev function, where $p$ is always a prime variable. By partial summation,
\begin{eqnarray}\pi(x)&=&\sum_{p\leq x}\frac{\log p}{\log p}=\frac{\theta(x)}{\log x}+\int_{2}^x \frac{\theta(t)}{t\log^2 t}dt,\\\theta(x)&=&\sum_{p\leq x}\log p=\pi(x)\log x-\int_{2}^x\frac{\pi(t)}{t}dt,\end{eqnarray}
so the functions $\pi(x)$ and $\theta(x)$ are basically equivalent. In particular,
\begin{eqnarray}\limsup_{x\to \infty}\frac{\theta(x)}{x}=\limsup_{x\to \infty}\frac{\pi(x)\log x}{x}\end{eqnarray}
and the same for $\liminf$. The function $\theta(x)$ is actually easier to deal with (we will see this soon; also, most proofs of the prime number theorem work with $\theta(x)$ rather than $\pi(x)$), so the following theorem is formulated for it.
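(A quick numerical aside: this equivalence is easy to watch in practice. Below is a minimal Python sketch, with an ad hoc sieve helper `primes_upto` that is not part of the text; it prints $\theta(x)/x$ next to $\pi(x)\log x/x$, and the two columns slowly approach each other, as the identities above predict.)

```python
import math

def primes_upto(n):
    """All primes <= n by the sieve of Eratosthenes (ad hoc helper)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [p for p in range(2, n + 1) if sieve[p]]

for x in [10**3, 10**4, 10**5, 10**6]:
    ps = primes_upto(x)
    theta = sum(math.log(p) for p in ps)   # theta(x) = sum of log p over p <= x
    ratio_pi = len(ps) * math.log(x) / x   # pi(x) log x / x
    print(x, theta / x, ratio_pi)
```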
Theorem (Chebychev). We have
\begin{eqnarray}(\log 2+o(1))x\leq \theta(x)\leq (\log 4)x,\end{eqnarray}
where $o(1)$ denotes a quantity converging to $0$ as $x\to \infty$.
Proof. The idea (in this easier proof, due to Erdős) is to find a quantity that can be analyzed easily without any knowledge of primes but is still closely related to them. The choice is the central binomial coefficient $\binom{2n}{n}$, which is relatively small but still divisible by a large number of primes. By the binomial formula,
\begin{eqnarray}4^n=\sum_{k=0}^{2n}\binom{2n}{k},\end{eqnarray}
so, since $\binom{2n}{n}$ is the largest of the $2n+1$ summands and the remaining summands are positive,
\begin{eqnarray}\frac{4^n}{2n+1}<\binom{2n}{n}<4^n.\end{eqnarray}
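(A quick sanity check of both inequalities with exact integer arithmetic; `math.comb` is in the Python standard library from 3.8 on, and the division is cleared so that everything stays in integers.)

```python
import math

# 4^n/(2n+1) < C(2n,n) < 4^n, with the division cleared
for n in range(1, 300):
    c = math.comb(2 * n, n)
    assert 4**n < c * (2 * n + 1) and c < 4**n
print("both bounds hold for n = 1, ..., 299")
```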
On the other hand, $\binom{2n}{n}$ is divisible by all the primes in $(n,2n]$, as they divide $(2n)!$ but not $n!$, so
\begin{eqnarray}\log \prod_{n<p\leq 2n}p=\theta(2n)-\theta(n)\leq \log \binom{2n}{n}\leq (\log 4) n.\end{eqnarray}
To pass from this to an upper bound valid for all $x$, we show by strong induction that $\theta(N)\leq (\log 4)N$ for every positive integer $N$; this suffices, since then $\theta(x)=\theta(\lfloor x\rfloor)\leq (\log 4)\lfloor x\rfloor\leq (\log 4)x$ for all real $x\geq 1$. The cases $N=1,2$ are clear. If $N>2$ is even, then $N$ is not a prime, so $\theta(N)=\theta(N-1)\leq (\log 4)(N-1)<(\log 4)N$. If $N=2m+1$ is odd, we argue as above with $\binom{2m+1}{m}$ in place of $\binom{2n}{n}$: every prime in $(m+1,2m+1]$ divides $\binom{2m+1}{m}$, and as $\binom{2m+1}{m}=\binom{2m+1}{m+1}$ and both of these appear in $\sum_{k=0}^{2m+1}\binom{2m+1}{k}=2^{2m+1}$, we have $\binom{2m+1}{m}\leq 4^m$. Hence, by the induction hypothesis,
\begin{eqnarray}\theta(2m+1)\leq \theta(m+1)+(\log 4)m\leq (\log 4)(m+1)+(\log 4)m=(\log 4)(2m+1).\end{eqnarray}
For the other direction, we must consider the exponents of the primes dividing $\binom{2n}{n}$. Let $v_p(m)$ be the exponent of the prime $p$ in the prime factorization of $m$. We need to determine $v_p\left(\binom{2n}{n}\right)=v_p((2n)!)-2v_p(n!)$, so a formula for $v_p(n!)$ would be useful. There is indeed such a formula, due to Legendre, namely
\begin{eqnarray}v_p(n!)=\sum_{k=1}^{\infty}\left\lfloor \frac{n}{p^k}\right\rfloor.\end{eqnarray}
This formula is easy to verify: we have $v_p(n!)=v_p(1)+v_p(2)+...+v_p(n)$, and among the numbers $m\in\{1,...,n\}$, exactly $\left\lfloor \frac{n}{p}\right\rfloor$ satisfy $v_p(m)\geq 1$, exactly $\left\lfloor \frac{n}{p^2}\right\rfloor$ satisfy $v_p(m)\geq 2$, and so on, so the formula follows.
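(Legendre's formula is also easy to test by brute force. A minimal Python sketch, with helper names of my own choosing, compares the formula against a direct count of the factors of $p$ in $n!$:)

```python
def v_p_factorial(n, p):
    """Legendre's formula: v_p(n!) = sum of floor(n/p^k) over k >= 1."""
    total, pk = 0, p
    while pk <= n:
        total += n // pk
        pk *= p
    return total

def v_p(m, p):
    """Exponent of the prime p in m, by repeated division."""
    e = 0
    while m % p == 0:
        m //= p
        e += 1
    return e

for n in [10, 50, 100]:
    for p in [2, 3, 5, 7]:
        assert v_p_factorial(n, p) == sum(v_p(m, p) for m in range(1, n + 1))
print("Legendre's formula agrees with the direct count")
```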
It now follows that
\begin{eqnarray}R_p(n):=v_p\left(\binom{2n}{n}\right)=\sum_{k=1}^{\infty}\left(\left\lfloor\frac{2n}{p^k} \right\rfloor -2\left\lfloor\frac{n}{p^k} \right\rfloor\right).\end{eqnarray}
We make two observations from this formula:
(i) $R_p(n)\leq \frac{\log (2n)}{\log p}$, that is, $p^{R_p(n)}\leq 2n$;
(ii) $R_p(n)\leq 1$ for $p>\sqrt{2n}$.
Both are easy to see: each summand $\left\lfloor\frac{2n}{p^k} \right\rfloor -2\left\lfloor\frac{n}{p^k} \right\rfloor$ is $0$ or $1$, as $\lfloor 2y\rfloor-2\lfloor y\rfloor$ is always $0$ or $1$, and the summands with $p^k>2n$ vanish, so at most $\frac{\log (2n)}{\log p}$ of them are nonzero.
Now
\begin{eqnarray}\log \binom{2n}{n}&=&\sum_{p\leq 2n}R_p(n)\log p\\&\leq& \sum_{\sqrt{2n}<p\leq 2n}\log p+\sum_{p\leq \sqrt{2n}}\log (2n)\\&\leq& \theta(2n)+\sqrt{2n}\log (2n) \quad \quad (1).\end{eqnarray}
Therefore, the lower bound for the binomial coefficient gives
\begin{eqnarray}\theta(2n)\geq (\log 4)n-\sqrt{2n}\log (2n)-\log(2n+1),\end{eqnarray}
and taking $n=\lfloor x/2\rfloor$ and using $\theta(x)\geq \theta(2\lfloor x/2\rfloor)$, we get $\theta(x)\geq (\log 2+o(1))x.$ ■
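(The bounds just proved are explicit enough to check by machine. A small self-contained Python sketch verifies the lower bound for $\theta(2n)$ from the last display together with the upper bound $\theta(x)\leq (\log 4)x$; the sieve helper is ad hoc, written out so the block runs on its own.)

```python
import math

def primes_upto(n):
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [p for p in range(2, n + 1) if sieve[p]]

for n in [10**2, 10**3, 10**4, 10**5]:
    theta_2n = sum(math.log(p) for p in primes_upto(2 * n))
    lower = math.log(4) * n - math.sqrt(2 * n) * math.log(2 * n) - math.log(2 * n + 1)
    assert lower <= theta_2n <= math.log(4) * 2 * n
    print(n, lower, theta_2n, math.log(4) * 2 * n)
```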
Proving that the interval $[n,2n]$ contains a prime is now rather easy. Assume that there exists $n$ for which there is no prime in that interval. Then the chain of inequalities $(1)$ becomes stronger, since the sum is only over $p<n$, and in fact only over $p\leq \frac{2n}{3}$, because $R_p(n)=0$ for $p\in (\frac{2n}{3},n]$ (for such $p$ we have $\lfloor\frac{2n}{p}\rfloor=2$ and $\lfloor\frac{n}{p}\rfloor=1$, while $p^2>2n$ for $n\geq 5$). Comparing the resulting inequality with $\theta(\frac{2n}{3})\leq (\log 4)\frac{2n}{3}$ and with the lower bound $\binom{2n}{n}>\frac{4^n}{2n+1}$, one gets a contradiction for large $n$ (say $n\geq 1000$), and the small values of $n$ are easy to check. The above sketch is easy to complete.
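(The checking of small cases is indeed a machine's job. A minimal self-contained sketch verifying that every interval $(n,2n]$ with $n\leq 1000$ contains a prime:)

```python
import math

N = 1000
limit = 2 * N
sieve = bytearray([1]) * (limit + 1)
sieve[0:2] = b"\x00\x00"
for i in range(2, math.isqrt(limit) + 1):
    if sieve[i]:
        sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))

# pi_count[m] = number of primes <= m
pi_count = [0] * (limit + 1)
for m in range(1, limit + 1):
    pi_count[m] = pi_count[m - 1] + sieve[m]

# every interval (n, 2n] with n <= 1000 contains a prime
assert all(pi_count[2 * n] > pi_count[n] for n in range(1, N + 1))
print("Bertrand's postulate verified for n <= 1000")
```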
Three theorems of Mertens
We now turn to Mertens' theorems, which he proved in 1874, well before the prime number theorem was first proved (in 1896, by Hadamard and de la Vallée Poussin independently). As with Chebychev's theorem, the original forms are somewhat stronger than what we state here: Mertens derived explicit error terms and gave formulas for the involved constants in terms of the zeta function. We will also see that Mertens' theorems imply Chebychev's observation that $\frac{\pi(x)\log x}{x}$ converges to $1$ if it converges at all.
Theorem (Mertens). We have
\begin{eqnarray}&(i)&\quad \sum_{p\leq x}\frac{\log p}{p}=\log x+O(1),\\&(ii)&\quad \sum_{p\leq x}\frac{1}{p}=\log \log x+M+O\left(\frac{1}{\log x}\right),\\&(iii)&\quad \prod_{p\leq x}\left(1-\frac{1}{p}\right)=\frac{e^{-\gamma}}{\log x}\left(1+O\left(\frac{1}{\log x}\right)\right),\end{eqnarray}
where as usual $p$ runs through primes, $M$ is a constant (called Mertens' constant), and $\gamma$ is Euler's constant.
It turns out that $(i)$ has an easy elementary proof, and using partial summation, one gets $(ii)$ for free. Using the Taylor series of the logarithm, one also gets $(iii)$ for some constant in place of $e^{-\gamma}$. The hardest part of proving Mertens' theorems is showing that this constant is indeed Euler's constant. That part is based on the behavior of the Riemann zeta function $\zeta(s)$ near $s=1$, but no advanced properties of the function are needed.
Proof of (i). As in Chebychev's theorem, we consider a quantity that we can estimate well without knowing anything about primes, but which is still closely related to them. For this purpose, we investigate $n!$. It is easy to see by induction that $n^ne^{-n}\leq n!\leq n^n$ for $n\geq 1$ (alternatively, one may apply Stirling's approximation). On the other hand, using Legendre's formula for $v_p(n!)$, we see that
\begin{eqnarray}n\log n+O(n)&=&\log n!=\sum_{p\leq n}v_p(n!)\log p\\&=&n\sum_{p\leq n}\frac{\log p}{p}+O\left(\sum_{p\leq n}\log p\right)+O\left(n\sum_{p\leq n}\left(\frac{\log p}{p^2}+\frac{\log p}{p^3}+...\right)\right)\\&=&n\sum_{p\leq n}\frac{\log p}{p}+O(n)+O\left(n\sum_{p\leq n}\left(\frac{1}{p^{\frac{3}{2}}}+\frac{1}{p^{\frac{5}{2}}}+...\right)\right)\\&=&n\sum_{p\leq n}\frac{\log p}{p}+O(n)+O\left(n\sum_{p\leq n}\frac{1}{p^{\frac{3}{2}}}\right),\end{eqnarray}
and the claim follows upon dividing by $n$, as the last sum is bounded when $n\to \infty$. Above we used Chebychev's bound $\sum_{p\leq n}\log p=\theta(n)=O(n)$, as well as the estimates $\log p\leq \sqrt{p}$ and $x^2+x^3+...=\frac{x^2}{1-x}\leq 2x^2$ for $|x|\leq \frac{1}{2}$ (the latter applied with $x=\frac{1}{p}$). ■
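(A numerical illustration of $(i)$: the difference $\sum_{p\leq x}\frac{\log p}{p}-\log x$ should stay bounded; empirically it seems to settle down near $-1.33$. A short self-contained sketch:)

```python
import math

def primes_upto(n):
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [p for p in range(2, n + 1) if sieve[p]]

for x in [10**3, 10**4, 10**5, 10**6]:
    s = sum(math.log(p) / p for p in primes_upto(x))
    print(x, s - math.log(x))  # bounded, as (i) asserts
```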
Proof of (ii). Next we derive $(ii)$. Define $A(x)=\sum_{p\leq x}\frac{\log p}{p}$ and write $A(x)=\log x+R(x)$, so that $R(x)=O(1)$ by part $(i)$. Then by partial summation,
\begin{eqnarray}\sum_{p\leq x}\frac{1}{p}&=&\sum_{p\leq x}\frac{\log p}{p}\cdot \frac{1}{\log p}\\&=&\frac{A(x)}{\log x}+\int_{2}^x \frac{A(t)}{t\log^2 t}dt\\&=&1+O\left(\frac{1}{\log x}\right)+\int_{2}^x \frac{1}{t\log t}dt+\int_{2}^x \frac{R(t)}{t\log^2 t}dt\\&=&\log \log x-\log \log 2+1+\int_{2}^{\infty}\frac{R(t)}{t\log^2 t}dt+O\left(\int_{x}^{\infty}\frac{1}{t\log^2t}dt\right)\\&=&\log \log x+M+O\left(\frac{1}{\log x}\right),\end{eqnarray}
where $M=1-\log \log 2+\int_{2}^{\infty}\frac{R(t)}{t\log^2 t}dt$, and we used the antiderivatives $\int \frac{dt}{t\log t}=\log \log t$ and $\int \frac{dt}{t\log^2 t}=-\frac{1}{\log t}$.
Therefore $(ii)$ is proved. ■
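(Mertens' constant is visible numerically: the known value is $M\approx 0.2615$. A short sketch, with the same ad hoc sieve as before written out so the block runs on its own:)

```python
import math

def primes_upto(n):
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [p for p in range(2, n + 1) if sieve[p]]

for x in [10**4, 10**5, 10**6]:
    s = sum(1 / p for p in primes_upto(x))
    print(x, s - math.log(math.log(x)))  # approaches M = 0.26149...
```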
As a corollary, we get $(iii)$ for some constant. Indeed, by the Taylor series of the logarithm,
\begin{eqnarray}\sum_{p\leq x}\log \left(1-\frac{1}{p}\right)&=&-\sum_{p\leq x}\frac{1}{p}-\sum_{p\leq x}\left(\frac{1}{2p^2}+\frac{1}{3p^3}+...\right)\\&=&-\sum_{p\leq x}\frac{1}{p}-\sum_{p}\left(\frac{1}{2p^2}+\frac{1}{3p^3}+...\right)+O\left(\frac{1}{x}\right)\\&=&-\log \log x-M-H+O\left(\frac{1}{\log x}\right),\end{eqnarray}
where $H=\sum_{p}\left(\frac{1}{2p^2}+\frac{1}{3p^3}+...\right)$. Exponentiating, it follows that
\begin{eqnarray}\prod_{p\leq x}\left(1-\frac{1}{p}\right)=\frac{e^{-M-H}}{\log x}\exp\left(O\left(\frac{1}{\log x}\right)\right),\end{eqnarray}
which is $(iii)$ for some constant since $|e^y-1|\leq 2|y|$ when $|y|$ is small. Therefore it just remains to prove that $M+H=\gamma$, but this is not that easy since $M$ and $H$ have no nice closed forms.
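(Even though the identification $M+H=\gamma$ still lies ahead, the constant is visible numerically: $\log x\prod_{p\leq x}(1-\frac{1}{p})$ creeps toward $e^{-\gamma}\approx 0.5615$, with the slow $O(\frac{1}{\log x})$ convergence the theorem predicts. A sketch, with $\gamma$ hard-coded for comparison:)

```python
import math

def primes_upto(n):
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [p for p in range(2, n + 1) if sieve[p]]

GAMMA = 0.5772156649015329  # Euler's constant, hard-coded for comparison

for x in [10**4, 10**5, 10**6]:
    prod = 1.0
    for p in primes_upto(x):
        prod *= 1.0 - 1.0 / p
    print(x, prod * math.log(x), math.exp(-GAMMA))
```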
The idea for computing the constant in $(iii)$ is to consider, instead of $\sum_{p\leq x}\frac{1}{p}$, the prime zeta function $\sum_{p}\frac{1}{p^{1+s}}$ with $s>0$, letting $s\to 0+$ at the end. The advantage of this expression is that it is closely related to the Riemann zeta function, and adding $s$ to all the exponents makes many useful expressions convergent. We have
\begin{eqnarray}\sum_{p\leq x}\frac{1}{p^{1+s}}=\sum_{p}\frac{1}{p^{1+s}}-\sum_{p> x}\frac{1}{p^{1+s}},\end{eqnarray}
and we should study these two sums as $s\to 0+$. We take the logarithm of the Euler product of the Riemann zeta function
\begin{eqnarray}\zeta(s):=\sum_{n=1}^{\infty}\frac{1}{n^s}=\prod_{p}\left(1-\frac{1}{p^s}\right)^{-1},\quad s>1,\end{eqnarray}
which is just the fundamental theorem of arithmetic written in an analytic form. We find for $s>0$ that
\begin{eqnarray}\log \zeta(1+s)=-\sum_{p}\log \left(1-\frac{1}{p^{1+s}}\right)&=&\sum_{p}\frac{1}{p^{1+s}}+\sum_{p}\left(\frac{1}{2p^{2(1+s)}}+\frac{1}{3p^{3(1+s)}}+...\right)\\&=&\sum_{p}\frac{1}{p^{1+s}}+H+o_s(1),\end{eqnarray}
where $o_s(1)$ is something that approaches zero as $s\to 0$ (it may depend on $x$) and we exploited the uniform convergence of $\sum_{p}\left(\frac{1}{2p^{2(1+s)}}+\frac{1}{3p^{3(1+s)}}+...\right)$. It is well-known that $\zeta(1+s)$ has the Laurent series
\begin{eqnarray}\zeta(1+s)=\frac{1}{s}+\gamma+a_1s+...\end{eqnarray}
around $s=0$. The constant $\gamma$ indeed appears in this formula but at least in the following proof $\gamma$ arises from a different expression. The only thing we need is $\zeta(1+s)=\frac{1}{s}+O(1)$ around $s=0$, and this is easy to prove:
\begin{eqnarray}\zeta(1+s)=\int_{1}^{\infty}\frac{dt}{\lfloor t\rfloor^{1+s}}&=&\int_{1}^{\infty}\frac{dt}{t^{1+s}}+\int_{1}^{\infty}\left(\frac{1}{\lfloor t \rfloor^{1+s}}-\frac{1}{t^{1+s}}\right)dt\\&=&\frac{1}{s}+\int_{1}^{\infty}\left(\frac{1}{\lfloor t \rfloor}-\frac{1}{t}\right)dt+o_s(1),\end{eqnarray}
where the integrands of the two floor-function integrals are $O(\frac{1}{t^2})$ uniformly for $0\leq s\leq 1$, so both integrals converge and their difference is $o_s(1)$ by dominated convergence; this gives the claim.
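(The Laurent behavior is also easy to observe numerically; the following sketch assumes the third-party mpmath package, whose `zeta` is reliable near the pole, and in fact exhibits the full constant term $\gamma$:)

```python
# zeta(1+s) - 1/s should tend to Euler's constant 0.5772... as s -> 0+
# (assumes the mpmath package: pip install mpmath)
from mpmath import mp, zeta

mp.dps = 30
for k in range(1, 6):
    s = mp.mpf(10) ** (-k)
    print(s, zeta(1 + s) - 1 / s)
```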
Hence
\begin{eqnarray}\log \zeta(1+s)=\log\frac{1}{s}+o_s(1)=\sum_{p}\frac{1}{p^{1+s}}+H+o_s(1),\end{eqnarray}
that is,
\begin{eqnarray}\sum_{p}\frac{1}{p^{1+s}}=\log \frac{1}{s}-H+o_s(1).\end{eqnarray}
Since $\sum_{p\leq x}\frac{1}{p^{1+s}}\to \sum_{p\leq x}\frac{1}{p}$ as $s\to 0+$, comparing the last formula with $(ii)$ shows that $M=\gamma-H$, and hence $M+H=\gamma$, follows once we prove
\begin{eqnarray}\sum_{p>x}\frac{1}{p^{1+s}}=\log \frac{1}{s}-\log \log x-\gamma+o_s(1)+O\left(\frac{1}{\log x}\right).\end{eqnarray}
We use part (i) of Mertens' theorem, with the notation $A(x)=\sum_{p\leq x}\frac{\log p}{p}$, to see that
\begin{eqnarray}\sum_{p>x}\frac{1}{p^{1+s}}&=&\sum_{p>x}\frac{\log p}{p}\cdot \frac{1}{p^s\log p}\\&=&-\frac{A(x)}{x^s \log x}+\int_{x}^{\infty} A(t) \frac{s\log t +1}{t^{s+1}\log^2 t}dt\\&=&-1+O\left(\frac{1}{\log x}\right)+o_s(1)+\int_{x}^{\infty} A(t) \frac{s\log t+1}{t^{s+1}\log^2t}dt\\&=&-1+O\left(\frac{1}{\log x}\right)+o_s(1)+\int_{x}^{\infty} \frac{s\log t+1}{t^{s+1}\log t}dt+O\left(\int_{x}^{\infty} \frac{s\log t +1}{t^{s+1}\log^2 t}dt\right)\end{eqnarray}
by partial summation. As $s\to 0+$, the error integral approaches $\int_{x}^{\infty}\frac{1}{t\log^2 t}dt=O(\frac{1}{\log x})$.
The main integral can be divided into two pieces; the first one is
\begin{eqnarray}\int_{x}^{\infty}\frac{s}{t^{s+1}}dt&=&x^{-s}=1+o_s(1).\end{eqnarray}
Therefore the other part should be $\log \frac{1}{s}-\log \log x-\gamma$ up to negligible error. We evaluate the integral as follows.
\begin{eqnarray}\int_{x}^{\infty}\frac{1}{t^{s+1}\log t}dt&=&\int_{x}^{\infty}\frac{1}{t\log t}t^{-s}dt\\&=&[(\log \log t)t^{-s}]_{x}^{\infty}+s\int_{x}^{\infty}\frac{\log \log t}{t^{s+1}}dt\\&=&-\log \log x+o_s(1)+s\int_{x}^{\infty}\frac{\log \log t}{t^{s+1}}dt\\&=&-\log \log x+o_s(1)+s\int_{\log x}^{\infty}\frac{\log u}{e^{su}}du.\end{eqnarray}
We consider the last integral, which should be $\log \frac{1}{s}-\gamma$ up to small error. We compute
\begin{eqnarray}s\int_{\log x}^{\infty}\frac{\log u}{e^{su}}du&=&s\int_{0}^{\infty}(\log u)e^{-su}du-s\int_{0}^{\log x}(\log u)e^{-su}du\\&=&\int_{0}^{\infty} \log \frac{v}{s}e^{-v}dv-\int_{0}^{s\log x} \log \frac{v}{s}e^{-v}dv\\&=&\int_{0}^{\infty} (\log v)e^{-v}dv+\left(\log\frac{1}{s}\right)\int_{0}^{\infty}e^{-v}dv\\&&\quad -\int_{0}^{s\log x}(\log v)e^{-v}dv-\left(\log\frac{1}{s}\right)\int_{0}^{s\log x}e^{-v}dv \quad (2).\end{eqnarray}
Quite miraculously, we get exactly the right terms, as it is well-known that
\begin{eqnarray}\int_{0}^{\infty}(\log v)e^{-v}dv=-\gamma.\end{eqnarray}
The integrand blows up at $v=0$, but the integral is interpreted as the limit of integrals over $[\varepsilon,\infty)$, and the growth at the origin is only logarithmic, so the integral converges.
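(Incidentally, this classical value is easy to confirm to high precision, again assuming mpmath; its quadrature copes with the logarithmic singularity at the origin:)

```python
# numerical check that the integral of (log v) e^{-v} over (0, oo) is -gamma
# (assumes the mpmath package: pip install mpmath)
from mpmath import mp, quad, log, exp, euler

mp.dps = 25
val = quad(lambda v: log(v) * exp(-v), [0, mp.inf])
print(val, -euler)  # both should be -0.5772156649...
```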
Now $(2)$ becomes
\begin{eqnarray}-\gamma+\log \frac{1}{s}+o_s(1)+\left(\log \frac{1}{s}\right)(x^{-s}-1)=\log \frac{1}{s}-\gamma+o_s(1),\end{eqnarray}
as the third integral in $(2)$ is the integral of an integrable function over an interval of length $o_s(1)$, and $\left(\log \frac{1}{s}\right)(x^{-s}-1)=o_s(1)$ for fixed $x$. Putting the pieces together gives precisely the required formula for $\sum_{p>x}\frac{1}{p^{1+s}}$, so $(iii)$ is proved with the right constant. ■
From Mertens' theorem, we can easily deduce Chebychev's observation.
Theorem. If $\lim_{x\to \infty}\frac{\pi(x)\log x}{x}$ exists, its value is $1$.
Proof. Suppose that the value of the limit is $A$. By Chebychev's bounds, $A$ is positive and finite. Given $\varepsilon>0$, choose $M_{\varepsilon}$ such that $\frac{\pi(x)\log x}{x}\in [A-\varepsilon,A+\varepsilon]$ for all $x\geq M_{\varepsilon}$. Then for large $x$, by partial summation,
\begin{eqnarray}\sum_{p\leq x}\frac{\log p}{p}&=&\frac{\pi(x)\log x}{x}+\int_{2}^{x}\pi(t)\frac{\log t-1}{t^2}dt\\&=&O_{\varepsilon}(1)+O\left(\int_{2}^x \frac{1}{t\log t}dt\right)+\int_{M_{\varepsilon}}^x\pi(t)\frac{\log t}{t^2}dt\\&=&O_{\varepsilon}(1)+O(\log \log x)+\int_{M_{\varepsilon}}^{x}\frac{A}{t}dt+\int_{M_{\varepsilon}}^{x}\frac{\pi(t)\log t-At}{t^2}dt.\end{eqnarray}
The latter integral is bounded by $\varepsilon \log x$, since $|\pi(t)\log t-At|\leq \varepsilon t$, while the former one is $A\log x+O_{\varepsilon}(1)$. As $\varepsilon>0$ was arbitrary, we get
\begin{eqnarray}\sum_{p\leq x}\frac{\log p}{p}\sim A\log x.\end{eqnarray}
By Mertens' theorem $(i)$, this means $A=1$. ■