Friday, October 31, 2014

Goldbach's ternary problem

The ternary Golbach problem is the assertion that every odd integer $N\geq 7$ is the sum of three primes. It was first proved for large enough integers by I. M. Vinogradov in 1937, and the proof is a good starting point for the study of the Hardy-Littlewood circle method and of Vinogradov's method for estimating exponential sums over prime numbers, such as
\[\begin{eqnarray}\sum_{p\leq N}e(\alpha p);\quad e(x):=e^{2\pi i x}.\quad \quad(1)\end{eqnarray}\]
The full ternary problem was solved only in 2013 in a paper by H. Helfgott. In this post, we present the circle method and show how it can be used to resolve the ternary Golbach problem for large enough integers, assuming an estimate for the prime exponential sum $(1)$. This estimate is due to R. C. Vaughan and is based on his refined version of Vinogradov's method. In a later post, we prove this estimate, so the results here serve as an illustration of the capability of Vinogradov's method. We follow Vinogradov's book Method of Trigonometrical Sums in the Theory of Numbers, but replacing the application of Page's theorem by Siegel's theorem allows us to avoid some technicalities and improve the error terms.


When approaching the ternary Goldbach problem, one should first ask is how many representations we expect for an odd integer $N$ as a sum of three primes. As is common in analytic number theory, proofs that show the existence of some quantity (such as the number of representations of $N$ as a sum of three primes) have a tendency to give an asymptotic formula or at least a good lower bound for the quantity, and the case of the three primes problem is not an exception. A simple guess can be made as follows. The number of representations of $N$ as the sum of any three positive integers is easily seen to be
The prime number theorem says intuitively that $n$ is prime with probability $\frac{1}{\log n}$, so that to each representation $n_1+n_2+n_3=N$ of $N$ as a sum of three positive integers we could heuristically assign the probability $\frac{1}{\log n_1 \log n_2 \log n_3}$ that all the summands are primes. Then we would expect the number of ways to write $N$ as a sum of three primes to be approximately
\[\begin{eqnarray}\sum_{n_1+n_2+n_3=N}\frac{1}{\log n_1\log n_2\log n_3}&=&\sum_{n_1+n_2+n_3=N\atop n_i>\frac{N}{\log^{10}N}}\frac{1}{\log n_1\log n_2\log n_3}\\&+&O\left(\sum_{n_1+n_2+n_3=N\atop n_1\leq \frac{N}{\log^{10}N}}\frac{1}{\log n_1\log n_2\log n_3}\right)\\&=&\sum_{n_1+n_2+n_3=N\atop n_i>\frac{N}{\log^{10}N}}\frac{1+o(1)}{\log^3 N}+O\left(\frac{N}{\log^{10}N}\cdot N\right)\\&=&\frac{1}{2}\frac{N^2}{\log^3 N}.\end{eqnarray}\]
because $\log n=(1+o(1))\log N$ for $\frac{N}{\log^{10}N}\leq n\leq N$ and because for any given $n_1$ there are $\ll N$ solutions to $n_1+n_2+n_3=N$. Of course, we neglected all local obstructions for the primes $p_i$ to satisfy $p_1+p_2+p_3=N$, particularly that $N$ must be odd or otherwise there are just $\ll N$ representations. More generally, whenever a prime $p$ divides $N$, we should take into account some congruence conditions for the primes $p_i$ modulo $p$. Nevertheless, it turns out that the simple prediction is correct up to a factor that is bounded from above and below and the factor involves an expression depending on the prime divisors of $N$.

Theorem 1 (Ternary Goldbach for large numbers). Let $N$ be a positive integer. We have
\[\begin{eqnarray}|\{(p_1,p_2,p_3)\in \mathbb{P}^3: N=p_1+p_2+p_3\}|=\frac{1}{2}\mathfrak{S}(N)+O_A\left(\frac{N}{\log^{4}N}\right),\end{eqnarray}\]
where $\mathfrak{S}(N)$ is given by
\[\begin{eqnarray}\mathfrak{S}(N)=\prod_{p}\left(1+\frac{1}{(p-1)^3}\right)\prod_{p\mid N}\left(1-\frac{1}{p^2-3p+3}\right)\end{eqnarray}\]
and satisfies $2.301>\mathfrak{S}(N)>1.320$. Hence every large enough odd integer is the sum of three primes.

We remark that $N$ does not have to be odd in this theorem, but for even $N$ the product is zero, so the estimate becomes much worse than the trivial bound $N$ for representing an even number $N$ as a sum of three primes. Supposing the asymptotic formula, the rest of the proof is trivial because a computation shows that

The circle method

As mentioned, the proof is based on the circle method, which is a tool for estimating the number $r(N)$ of representations of a positive integer as the sum of $k$ elements from $A\subset \mathbb{N}$, where $k$ and $A$ are fixed. The original method of Hardy and Littlewood involves raising the (Taylor) generating function of the set $A$ to the $k$th power and using Cauchy's formula to recover the coefficients $r(N)$ of the exponentiated series. The complex integral is then considered near the unit circle, and the unit circle was divided to major arcs and minor arcs. The major arcs lie near the "rational angles" $e^{2\pi i \frac{a}{q}}$ with small $q$, while the minor arcs consist of everything else. The task is to show that the contribution of the major arcs is dominant, giving a lower bound and often even an asymptotic for $r(N)$. Hardy and Littlewood used their method to give a new proof of Waring's conjecture and to solve the ternary Goldbach problem for large numbers under the generalized Riemann hypothesis.

Vinogradov found a simpler and more effective version of the circle method, avoiding infinite series, complex integrals, and the issues of letting the circles of integration approach the unit circle. We present the circle method only for the three primes problem, but it is plain that the method (but not necessarily the estimates) generalize. To study additive problems in the primes, we define the trigonometric sum
\[\begin{eqnarray}S(N,\alpha)=\sum_{p\leq N}e(\alpha p),\alpha\in \mathbb{R}.\end{eqnarray}\]
Denote by $r_3(N)$ the number of representations of $N$ as a sum of three primes. The basic orthogonality relation from Fourier analysis tells
\[\begin{eqnarray}\int_{0}^1 e(\alpha(m-n))d\alpha=\begin{cases}1,\quad m=n\\ 0, \quad m\neq n,\end{cases}\end{eqnarray}\]
and allows us to write
\[\begin{eqnarray}r_3(N)=\int_{0}^1 S(N,\alpha)^3e(-N\alpha)d\alpha,\end{eqnarray}\]
because the integrand can be written as a sum of terms of the form $e((p_1+p_2+p_3-N)\alpha)$. In this way, the problem of estimating $r_3(N)$ is transformed to a problem in analysis and the theory of trigonometric sums. As in the Hardy-Littlewood approach, a key idea is to divide the domain of integration into finitely many intervals: the major arcs near rational angles $\alpha$ and the minor arcs consisting of the rest.

Definition. Let $A$ and $B$ be fixed large enough numbers and denote $\tau=\frac{N}{\log^A N}$. The major arcs are the set
\[\begin{eqnarray}\mathfrak{M}:=\bigcup_{q\leq \log^B N}\bigcup_{a=1\atop (a,q)=1}^{q}\mathfrak{M}_{q,a},\end{eqnarray}\]
The minor arcs are the complement $\mathfrak{m}:=[0,1]\setminus \mathfrak{M}$.

For the major arcs, the main tool will be an estimate for primes in arithmetic progressions, namely the Siegel-Walfisz theorem.

Theorem 2 (Siegel-Walfisz). For any $M>0$, there exists an (inexplicit) constant $C_M$ such that the number of primes $qn+a$ up to $x$ is
\[\begin{eqnarray}\pi(x;q,a)=\frac{1}{\varphi(q)}Li(x)+O(x\exp(-C_M\sqrt{\log N})),\end{eqnarray}\]
where the error is independent of $q$ for $q\leq \log^M x$ and
\[\begin{eqnarray}Li(x):=\int_{2}^{x}\frac{dt}{\log t}.\end{eqnarray}\]
The Siegel-Walfisz theorem is a uniform version of the prime number theorem in arithmetic progressions. The proof is based on finding zero-free regions for Dirichlet $L$-functions and can be found in Davenport's Multiplicative Number Theory. For the minor arcs, our tool is cancellation occurring in the exponential sum (1) (a precise theorem is formulated later).

We want to show that the major arcs to contribute most of the mass of the integral, while the minor arcs should produce the error term. To motivate this, suppose that $\alpha$ is very close to a rational number with small denominator. Let us say $\alpha=\frac{a}{q}+\varepsilon$ with $(a,q)=1,$ $q\leq \log N,$ $|\varepsilon|<\frac{1}{N\log^{4}N}$. Then
\[\begin{eqnarray}\Re(S(N,\alpha))&=&\sum_{b=1\atop (b,q)=1}^{q} \sum_{p\leq N\atop p\equiv b\hspace{-0.2cm}\pmod q}\cos\left(\frac{2\pi a}{q}\cdot p+\varepsilon p\right)\\&=&\sum_{b=1\atop (b,q)=1}^{q} \sum_{p\leq N}\left(\cos\left(\frac{2\pi ab}{q}\right)+O\left(\frac{1}{N\log^4 N}\right)p\right)\\&=&\sum_{b=1\atop (b,q)=1}^{q} \cos\left(\frac{2\pi ab}{q}\right) \pi(N;q,b)+O\left(\frac{N}{\log^3 N}\right)\\&=&\sum_{b=1\atop (b,q)=1}^{q}\frac{1}{\varphi(q)}Li(N)\cos\left(\frac{2\pi ab}{q}\right) +O\left(\frac{N}{\log^3 N}\right)\quad (*)\\&=&\frac{\mu(q)}{\varphi(q)}Li(N)+O\left(\frac{N}{\log^3 N}\right)\quad (**),\end{eqnarray}\]
where we used the Siegel-Walfisz theorem for $(*)$, whereas for $(**)$ we employed the fact that the primitive roots of unity of order $q$ have sum equal to $\mu(q)$ (and hence their real parts have the same sum). The result has absolute value $\gg \frac{N}{\log^2 N}$. Therefore, $S(N,\alpha)$ is almost as large as it possibly can be when $\alpha$ is this near to a rational number with a small denominator. On the other hand, if $\alpha$ is a "randomly chosen" real number (hence irrational with probability $1$), we expect the terms $e(\alpha p)$ to behave like independent random variables with modulus one and uniform distribution on the unit circle, so that $S(N,\alpha)$ should be only of order $\sqrt{N}$. We consider first the contribution of the major arcs.

Major arcs

We fix an $\alpha=\frac{a}{q}$ with $|z|\leq \tau^{-1}=\frac{N}{\log^A N}$, so $\alpha$ belongs to the a major arc $\mathfrak{M}_{a,q}$. We approximate the sum $S(N,\alpha)$ by discretizing it, that is, splitting the summation into short intervals and using linear approximation on each interval. To this end, let $D=\lfloor\log^{A}N\rfloor$, where $A$ will be chosen later. We approximate
\[\begin{eqnarray}S(N,\alpha)&=&\sum_{k=1}^{D-1}\sum_{\frac{kN}{D}<p\leq \frac{(k+1)N}{D}}e(\alpha p)+O\left(\frac{N}{D}\right)\\&=&\sum_{k=1}^{D-1}\sum_{\frac{kN}{D}<p\leq \frac{(k+1)N}{D}}e\left(\frac{a}{q}\cdot p+zp\right)+O\left(\frac{N}{D}\right)\\&=&\sum_{k=1}^{D-1}\sum_{\frac{kN}{D}<p\leq \frac{(k+1)N}{D}}e\left(\frac{a}{q}\cdot p+z\cdot \frac{kN}{D}\right)+O(zN)+O\left(\frac{N}{D}\right)\quad \text{(*)}\\&=&\sum_{k=1}^{D-1}e\left(z\frac{kN}{D}\right)\sum_{\frac{kN}{D}<p\leq \frac{(k+1)N}{D}}e\left(\frac{a}{q}\cdot p\right)+O\left(\frac{N}{\log^A N}\right)\\&=&\sum_{k=1}^{D-1}e\left(z\frac{kN}{D}\right)\sum_{b=1}^{q}e\left(\frac{ab}{q}\right)\left(\pi\left(\frac{(k+1)N}{D};q,b\right)-\pi\left(\frac{kN}{D};q,b\right)\right)\\&+&O\left(\frac{N}{\log^A N}\right)\\&=&\sum_{b=1\atop (b,q)=1}^{q}\frac{e\left(\frac{ab}{q}\right)}{\varphi(q)}\sum_{k=1}^{D-1}e\left(z\frac{kN}{D}\right)\int_{\frac{kN}{D}}^{\frac{(k+1)N}{D}}\frac{dx}{\log x}+O\left(\frac{N}{\log^A N}\right)\\&+&O(DN\exp(-C\sqrt{\log N}))\quad \text{(**)}\\&=&\sum_{b=1\atop (b,q)=1}^{q}\frac{e\left(\frac{ab}{q}\right)}{\varphi(q)}\sum_{k=1}^{D-1}\int_{\frac{kN}{D}}^{\frac{(k+1)N}{D}}\frac{e(zx)}{\log x}dx+O\left(\frac{N}{\log^A N}\right)\quad \text{(***)}\\&=&\frac{\mu(q)}{\varphi(q)}\int_{2}^{N}\frac{e(zx)}{\log x}+O\left(\frac{N}{\log^A N}\right)\quad \text{(****)}\\&:=&\frac{\mu(q)}{\varphi(q)}J(N,z)+O\left(\frac{N}{\log^A N}\right)\end{eqnarray}\]
Formula $\text{(*)}$ is obtained using $|e(zx)-e(z\frac{kN}{D})|\ll \frac{N}{D}$. For formula $\text{(**)}$, we apply the Siegel-Walfisz theorem and for $\text{(***)}$ we use again $|e(zx)-e(z\frac{kN}{D})|\ll \frac{N}{D}$ so that the error in replacing $e(z\frac{kN}{D})$ by $e(zx)$ everywhere is $\\ \ll D\cdot \frac{N}{D}\cdot \frac{N}{D}\ll N\log^{-A}N.$ Finally, formula $\text{(****)}$ follows from the fact that $e(\frac{ab}{q})$ runs through the primitive roots of unity of order $q$, and we already mentioned the fact that they sum up to $\mu(q)$.

Next we will replace $J(N,z)$ in the major arc integrals by the simpler approximations
\[\begin{eqnarray}I(N,z)=\int_{2}^N \frac{e(zx)}{\log N}dx,\quad \Sigma(N,z):=\sum_{2\leq n\leq N-1}\frac{e(nz)}{\log N},\end{eqnarray}\]
and then evaluate $\int_{0}^1 \Sigma(\alpha)^3e(-N\alpha)d\alpha$ using orthogonality. We need a lemma.

Lemma 3. We have
\[\begin{eqnarray}&(i)&\quad|J(N,z)-I(N,z)|\ll \min\left\{\frac{N}{\log^2 N},\frac{1}{|z|\log N}\right\},\\&(ii)&\quad |I(N,z)|,|J(N,z)|,|\Sigma(N,z)|\ll \min\left\{\frac{N}{\log N},\frac{1}{|z|\log N}\right\}\\&(iii)&\quad |I(N,z)-\Sigma(N,z)|\ll \min\left\{\frac{1}{\log N},\frac{1}{|z|\log N}\right\}.\end{eqnarray}\]
Proof. We have
\[\begin{eqnarray}|J(N,z)-I(N,z)|\ll \int_{2}^N \left(\frac{1}{\log x}-\frac{1}{\log N}\right)\ll \frac{N}{\log^2 N}.\end{eqnarray}\]
\[\begin{eqnarray}I(N,z)=\frac{e(Nz)-e(2z)}{z\log N}\ll \frac{1}{z\log N},\end{eqnarray}\]
and by partial integration we obtain
\[\begin{eqnarray}J(N,z)=\left[\frac{z^{-1}(e(zx)-e(2z))}{\log x}\right]_{2}^N+\int_{2}^{N}\frac{e(zx)-e(2z)}{zx\log^2 x}\ll \frac{1}{z\log N}.\end{eqnarray}\]
By the sum formula for the geomertric series,
\[\begin{eqnarray}\Sigma(N,z)=\frac{e(Nz)-e(2z)}{(e(z)-1)\log N}\ll \frac{1}{z\log N},\end{eqnarray}\]
so also
\[\begin{eqnarray}|I(N,z)-\Sigma(N,z)|\ll \frac{1}{\log N} \left|\frac{1}{2\pi i z}-\frac{1}{e(z)-1}\right|\ll \frac{1}{\log N}.\end{eqnarray}\]
The rest of the estimates are trivial, so we conclude that $(i), (ii)$ and $(iii)$ hold. ■

With the help of the estimates above, we are able to approximate the major arc integrals. We have
\[\begin{eqnarray}\int_{\mathfrak{M}_{q,a}}S(N,\alpha)^3e(-N\alpha)d\alpha&=&e(-\frac{a}{q}N)\int_{0}^{1}S\left(N,\frac{a}{q}+z\right)^3 e(-Nz)dz\\&=&e(-\frac{a}{q}N)\frac{\mu(q)}{\varphi(q)^3}\int_{-\tau^{-1}}^{\tau^{-1}}J(N,z)^3e(-Nz)dz\\&+&O\left(\int_{-\tau^{-1}}^{\tau^{-1}}\left|S\left(N,\frac{a}{q}+z\right)-\frac{\mu(q)}{\varphi(q)}J(N,z)\right|^3dz\right)\\&=&e(-\frac{a}{q}N)\frac{\mu(q)}{\varphi(q)^3}\int_{-\tau^{-1}}^{\tau^{-1}}J(N,z)^3e(-Nz)dz+O\left(\frac{N^2}{\log^{2A}N}\right)\end{eqnarray}\]
applying the approximation for $S(N,\alpha)$ established above. This further equals
\[\begin{eqnarray}&&e(-\frac{a}{q}N)\frac{\mu(q)}{\varphi(q)^3}\int_{-\tau^{-1}}^{\tau^{-1}}I(N,z)^3e(-Nz)dz\\&+&O\left(\frac{1}{\varphi(q)^3}\int_{0}^{\tau^{-1}}|J(N,z)-I(N,z)|^3 dz+\frac{N^2}{\log^{2A}N}\right)\end{eqnarray}\]
because $J(N,-z)=\overline{J(N,z)},I(N,-z)=\overline{I(N,z)}$. By Lemma 3, the error is bounded by
\[\begin{eqnarray}\ll \frac{1}{\varphi(q)^3}\left(\int_{0}^{N^{-1}}\frac{N^3}{\log^{6}N}dz+\int_{N^{-1}}^{\infty}\frac{N}{\log^2 N}\cdot \frac{1}{z^2 \log^2 N}dz\right)\ll \frac{N^2}{\varphi(q)^3\log^4 N}.\end{eqnarray}\]
We may choose $2A=3B+4$ (recall that by the definition of $B$, $q$ is bounded by $\log^B N$) so that the other error $O(N\log^{-2A}N)$ is not larger than this one. We want to replace $I(N,z)$ by $\Sigma(N,z)$, so we write
\[\begin{eqnarray}\int_{\mathfrak{M}_{q,a}}S(N,\alpha)^3e(-N\alpha)d\alpha&=&e(-\frac{a}{q}N)\frac{\mu(q)}{\varphi(q)^3}\int_{-\tau^{-1}}^{\tau^{-1}}\Sigma(N,z)^3e(-Nz)dz\\&+&O\left(\int_{-\tau^{-1}}^{\tau^{-1}}|I(N,z)-\Sigma(N,z)|^3 dz+O\left(\frac{N^2}{\varphi(q)^3\log^4 N}\right)\right),\end{eqnarray}\]
and the error here is
\[\begin{eqnarray}\ll \int_{0}^{N^{-1}}\frac{1}{\log^3 N}dz+\int_{N^{-1}}^{\infty}\frac{1}{z^2\log^3 N}dz\ll N.\end{eqnarray}\]
Lastly, we replace the integral of $\Sigma(N,z)^3e(-Nz)$ over $[-\tau^{-1},\tau^{-1}]$ by the integral of the same function over $[-\frac{1}{2},\frac{1}{2}]$, the error being
\[\begin{eqnarray}\ll \int_{\tau^{-1}}^{\frac{1}{2}}|\Sigma(N,z)^3|\ll \frac{1}{\log^3 N} \int_{\tau^{-1}}^{\frac{1}{2}}z^{-3}dz\ll \frac{\tau^2}{\log^3 N}=\frac{N^2}{\log^{2A+3}}.\end{eqnarray}\]
As $A>2$, this is certainly smaller than $N\log^{-4}N$. In conclusion,
We observe that the integral on the right is exactly $\log^3 N$ times the number of solutions to $N-1=n_1+n_2+n_3$ with $n_1,n_2,n_3\geq 2$, so we have managed to invert the procedure that led to the integral we are considering, but this time the counting problem is trivial. It follows that the integral over the minor arc is
\[\begin{eqnarray}\frac{1}{2}e(-\frac{a}{q})\frac{\mu(q)}{\varphi(q)^3}\cdot \frac{N^2}{\log^3 N}+O\left(\frac{N^2}{\varphi(q)^3\log^{4}N}\right).\end{eqnarray}\]
We note that the major arcs are disjoint because $|\frac{a}{q}-\frac{a}{q'}|\geq \frac{1}{qq'}\geq \frac{1}{\log^{2B}N}>\frac{2}{\tau}$. Thus we may sum over $a$ coprime to $q$ and $q\leq \log^{B} N$, and find that the contribution of all the major arcs is
\[\begin{eqnarray}\sum_{q\leq \log^{B} N}\frac{\mu(q)}{\varphi(q)^3}c_q(-N)\cdot \frac{N^2}{2\log^3 N}+O\left(\frac{N^2}{\log^4 N}\right),\end{eqnarray}\]
\[\begin{eqnarray}c_q(m):=\sum_{a=1\atop (a,q)=1}^{q}e\left(\frac{a}{q}m\right)\end{eqnarray}\]
is called a Ramanujan sum. Notice that it was vital to have $\frac{1}{\varphi(q)^3}$ as a factor in the error term. Since $\frac{1}{\varphi(q)^3}\ll \frac{1}{q^2}$, we may extend the above sum to infinity, so that the major arcs give
\[\begin{eqnarray}\sum_{q=1}^{\infty}\frac{\mu(q)}{\varphi(q)^3}c_q(-N)\cdot \frac{N^2}{2\log^3 N}+O\left(\frac{N^2}{\log^4 N}\right).\end{eqnarray}\]
In the sum over $q$, all the functions are multiplicative because
\[\begin{eqnarray}c_q(n)c_r(n)=\sum_{a=1\atop (a,q)=1}^q \sum_{b=1\atop (b,r)=1}^r e\left(\frac{ar+bq}{qr}\right)=\sum_{k=1\atop (k,q)=1}^{qr}e\left(\frac{kn}{qr}\right)=c_{qr}(n)\end{eqnarray}\]
for coprime $q$ and $r$ because the numbers $ar+bq$ are distinct modulo $qr$ and coprime to $qr$ so that they range exactly through the invertible residue classes modulo $qr$. For prime values of $q$, obviously $c_q(-N)=-1$ if $q\nmid N$ and $c_q(-N)=q-1$ if $q\mid N$. Hence the major arcs give
\[\begin{eqnarray}&&\prod_{p\nmid N}\left(1+\frac{1}{(p-1)^3}\right)\prod_{p\mid N}\left(1-\frac{1}{(p-1)^2}\right)\frac{N^2}{2\log^{3}N}+O\left(\frac{N^2}{\log^4 N}\right)\\&=&\prod_{p}\left(1+\frac{1}{(p-1)^3}\right)\prod_{p\mid N}\left(\frac{1-\frac{1}{(p-1)^2}}{1+\frac{1}{(p-1)^3}}\right)\frac{N^2}{2\log^{3}N}+O\left(\frac{N^2}{\log^4 N}\right),\end{eqnarray}\]
and this is exactly what we want, so the major arc estimates are finished.

Minor arcs

Inside the minor arcs, we no longer expect discretizing to help us, since we must sum over $q$ all the way to $\frac{N}{\log^A N}$ and we do not have the helpful factor $\frac{1}{\varphi(q)^3}$ in the error terms if $q$ grows faster than any power of $\log N$. However, we can proceed in a simpler way, by utilizing the good rational approximation $\frac{a}{q}$ to $\alpha$ to show that $S(N,\alpha)$ is pointwise small. For finding good rational approximations for arbitrary real numbers, we need the following classical theorem of Dirichlet.

Theorem 4 (Dirichlet's approximation theorem). Let $\alpha$ be a real number and $N$ a positive integer. Then there exists a positive integer $q\in [1,N]$ and an integer $a$ coprime to $q$ satisfying
\[\begin{eqnarray}\left|\alpha-\frac{a}{q}\right|\leq \frac{1}{qN}.\end{eqnarray}\]
In particular, there are infinitely many coprime solutions $(a,q)$ to the inequality
\[\begin{eqnarray}\left|\alpha-\frac{a}{q}\right|\leq \frac{1}{q^2}.\end{eqnarray}\]
Proof. It suffices to find $a$ and $q$ which are not necessarily coprime, for if $d$ is their greatest common divisor, then $\frac{a}{d},\frac{q}{d}$ also satisfy the inequalities. Consider the numbers $\|q\alpha\|$ for $q=1,2,...,N$, where $\|\cdot\|$ denotes the distance to the nearest integer. By the pigeonhole principle, we can find $q_1, q_2\in [1,N]$ with $\|q_1\alpha\|-\|q_2\alpha\|\leq \frac{1}{N}$ so that $\||q_1-q_2|\alpha\|\leq \frac{1}{N}$. Hence $|q\alpha-a|\leq \frac{1}{N}$ for some $q$ and $a$, proving the first claim. The second claim of the theorem follows from $\frac{1}{qN}\leq \frac{1}{q^2}$ when we let $N$ grow. ■

Now we can apply the following bound of Vaughan for $S(N,\alpha)$, which will be proved in a later post using Vinogradov's method for estimating exponential sums and Vaughan's identity. Vinogradov also obtained an estimate for the prime expoential sum, but with the crucial difference that his upper bound is at best of size $N\exp(-c\sqrt{\log N})$, while Vaughan's estimate, without being more complicated, manages to save even a power of $N$, giving $\ll N^{\frac{4}{5}}\log^{4}N$ if $\alpha$ is suitably well-behaved (say $\alpha$ has bounded coefficients in its contunued fraction expansion).

Theorem 5 (Vaughan's estimate for prime exponential sums). Let $\alpha$ be a real number with a rational approximation $\frac{a}{q}$ satisfying $(a,q)=1,q\in [1,N]$ and $|\alpha-\frac{a}{q}|\leq \frac{1}{q^2}.$ Then
\[\begin{eqnarray}S(N,\alpha)\ll \log^{3}N\left(\frac{N}{\sqrt{q}}+N^{\frac{4}{5}}+\sqrt{qN}\right).\end{eqnarray}\]

We use this to estimate the contribution of the minor arcs. Let $\alpha$ belong to a minor arc, and choose by Dirichlet's theorem a rational number $\frac{a}{q}$ with $(a,q)=1, q\in [1,\tau]$, and $|\alpha-\frac{a}{q}|\leq \frac{1}{q \tau}\leq \frac{1}{q^2}$. We must have $q>\log^B N$, for otherwise $\alpha$ would lie inside a major arc. Consequently,
\[\begin{eqnarray}S(N,\alpha)\ll N(\log^{3-\frac{B}{2}}N+(\log^3 N)N^{-\frac{1}{5}}+\log^{3-\frac{A}{2}}N)\ll \frac{N}{\log^3 N}\end{eqnarray}\]
for large $B$, say $B=12$ (remember that $A=\frac{3B+2}{2}>B$). Thus
\[\begin{eqnarray}\int_{\mathfrak{m}}S(N,\alpha)^3e(-N\alpha)d\alpha&\ll& \max_{\alpha\in \mathfrak{m}}|S(N,\alpha)|\cdot\int_{0}^{1}|S(N,\alpha)|^2 d\alpha\\&\ll& \frac{N}{\log^{3}N}\int_{0}^{1}|S(N,\alpha)|^2 d\alpha.\end{eqnarray}\]
The integral $\int_{0}^{1}|S(N,\alpha)|^2 d\alpha$ has some important cancellation; by expanding, we see that it counts the solutions to $p_1=p_2$ in prime numbers up to $N$. Hence
\[\begin{eqnarray}\int_{\mathfrak{m}}S(N,\alpha)^3e(-N\alpha)d\alpha&\ll& \frac{N}{\log^{3}N}\pi(N)\ll \frac{N^2}{\log^4 N}.\end{eqnarray}\]
As a remark, the bound $O\left(\frac{N}{\log N}\right)$ for the quadratic average $S(N,\alpha)$ confirms the prediction that $S(N,\alpha)$ is often size less than $\sqrt{N}$, although we are not able to say for which values of $\alpha$ this happens. Combining the major and minor arc estimates yields Theorem 1. ■