Large sieve for exponential sums
The method of the large sieve was developed gradually over decades, starting with Linnik in 1940s, and in the 1960s and 70s by Rényi, Roth, Bombieri, Gallagher, Davenport, Halberstam, Montgomery and Vaughan, and others.
The fundamental tool for deriving the arithmetic large sieve is the analytic large sieve inequality, which says essentially that any exponential sum has square root cancellation when averaged quadratically over all the reduced rationals with denominator at most $z$. The rationals could be replaced by any points that are separated by at least $z^{-2}$, but for our purposes, the rationals are sufficient.
Theorem 1 (Large sieve for exponential sums). Let $a_n$ be any complex numbers, and let
\[\begin{eqnarray}S(x;\alpha):=\sum_{n\leq x}a_ne(n\alpha)\end{eqnarray}\]
be the corresponding exponential sum. For any positive integer $z$ we have
\[\begin{eqnarray}\sum_{q\leq z}\sum_{1\leq a\leq q\atop (a,q)=1}\left|S\left(x;\frac{a}{q}\right)\right|^2\leq (z^2+4\pi x)\sum_{n\leq x}|a_n|^2.\end{eqnarray}\]
The constant $4\pi$ in the theorem could be improved, but we are not concerned with the exact constant. Therefore, we may follow a simple proof due to Gallagher.
We start with a lemma, which can be viewed as a simple Sobolev-type inequality, as it involves the norms of $f$ and $f'$ for any smooth function.
Lemma 2. Let $f$ be a continuously differentiable function from $[0,1]$ to $\mathbb{C}$, extended $1$-periodically to $\mathbb{R}$. For any positive integer $z$, the inequality
\[\begin{eqnarray}\sum_{q\leq z}\sum_{1\leq a \leq q\atop (a,q)=1}\left|f\left(\frac{a}{q}\right)\right|\leq z^2\int_{0}^{1}|f(\alpha)|d\alpha+\int_{0}^{1}|f'(\alpha)|d\alpha.\end{eqnarray}\]
holds.
Proof. We start from the identity
\[\begin{eqnarray}-f\left(\frac{a}{q}\right)=-f(\alpha)+\int_{\frac{a}{q}}^{\alpha}f'(\beta)d\beta,\end{eqnarray}\]
which implies
\[\begin{eqnarray}\left|f\left(\frac{a}{q}\right)\right|\leq |f(\alpha)|+\int_{\frac{a}{q}}^{\alpha}|f'(\beta)|d\beta.\end{eqnarray}\]
For each fraction $\frac{a}{q}$, we integrate the above inequality over $I_{a,q}:=[\frac{a}{q}-\frac{1}{2z^2},\frac{a}{q}+\frac{1}{2z^2}]$ to obtain
\[\begin{eqnarray}\frac{1}{z^2}\left|f\left(\frac{a}{q}\right)\right|&\leq& \int_{I_{a,q}}|f(\alpha)|d\alpha+\int_{I_{a,q}}\int_{I_{a,q}}|f'(\beta)|d\beta d\alpha\\&=&\int_{I_{a,q}}|f(\alpha)|d\alpha+\frac{1}{z^2}\int_{I_{a,q}}|f'(\beta)|d\beta.\end{eqnarray}\]
The intervals $I_{a,q}$ are disjoint, since any two distinct fractions $\frac{a}{q},\frac{a'}{q'}$ with $q,q'\leq z$ satisfy $\left|\frac{a}{q}-\frac{a'}{q'}\right|\geq \frac{1}{qq'}> \frac{1}{z^2}$. Therefore summing over all reduced fractions with denominator at most $z$ produces
\[\begin{eqnarray}\sum_{q\leq z}\sum_{1\leq a \leq q\atop (a,q)=1}\left|f\left(\frac{a}{q}\right)\right|\leq z^2\int_{0}^{1}|f(\alpha)|d\alpha+\int_{0}^{1}|f'(\alpha)|d\alpha,\end{eqnarray}\]
as wanted. ■
We can now prove the large sieve for exponential sums.
Proof of Theorem 1. We choose $f(\alpha)=S(x;\alpha)^2$ in Lemma 2; this gives
\[\begin{eqnarray}\sum_{q\leq z}\sum_{1\leq a \leq q\atop (a,q)=1}\left|S\left(x;\frac{a}{q}\right)\right|^2&\leq& z^2\int_{0}^{1}|S(x;\alpha)|^2d\alpha+2\int_{0}^{1}|S(x;\alpha)S'(x;\alpha)|d\alpha\\&\leq& z^2\int_{0}^{1}|S(x;\alpha)|^2d\alpha+2\left(\int_{0}^{1}|S(x;\alpha)|^2d\alpha \int_{0}^{1}|S'(x;\alpha)|^2d\alpha\right)^{\frac{1}{2}},\end{eqnarray}\]
where Cauchy-Schwarz was applied. By Parseval's theorem, the first integral on the right is $\sum_{n\leq x}|a_n|^2$, while the integral of $|S'(x;\alpha)|^2$ is
\begin{eqnarray}\sum_{n\leq x}4\pi^2 n|a_n|^2\leq 4\pi^2 x^2\sum_{n\leq x}|a_n|^2,\end{eqnarray} again by Parseval's theorem. The theorem follows by putting these estimates together. ■
\begin{eqnarray}\sum_{n\leq x}4\pi^2 n|a_n|^2\leq 4\pi^2 x^2\sum_{n\leq x}|a_n|^2,\end{eqnarray} again by Parseval's theorem. The theorem follows by putting these estimates together. ■
Arithmetic large sieve
Before proving the arithmetic large sieve, we derive a couple of simple formulas for Ramanujan sums that will be utilized in the proof. A Ramanujan sum is defined as
\[\begin{eqnarray}c_q(n):=\sum_{1\leq a\leq q\atop (a,q)=1}e\left(\frac{an}{q}\right).\end{eqnarray}\]
Lemma 3. $(i)$ For $(q_1,q_2)=1$, we have $c_{q_1q_2}(n)=c_{q_1}(n)c_{q_2}(n)$,
$(ii)$ For any $q$, $c_q(n)=\sum_{d\mid(q,n)}d\cdot\mu\left(\frac{q}{d}\right)$,
$(iii)$ For $(q,n)=1$, we have $c_q(n)=\mu(q)$.
Proof. $(i)$ We compute
\[\begin{eqnarray}c_{q_1}(n)c_{q_2}(n)=\sum_{1\leq a\leq q_1\atop (a,q_1)=1}\sum_{1\leq b\leq q_2\atop (b,q_2)=1}e\left(n\cdot \frac{aq_2+bq_1}{q_1q_2}\right)\end{eqnarray}\]
As $a$ and $b$ range through the ranges in the double sum, $aq_2+bq_1$ ranges through the residue classes $\pmod{q_1q_2}$ that are coprime to $q_1q_2$. This proves the claim.
$(ii)$ Let $1_{k\mathbb{Z}}(n)=1$ if $k\mid n$ and $0$ otherwise. Then the claim is
\[\begin{eqnarray}c_q(n)=\sum_{d\mid q}\mu\left(\frac{q}{n}\right)\cdot d1_{d\mathbb{Z}}(n).\end{eqnarray}\]
By Möbius inversion, this is equivalent to
\[\begin{eqnarray}m1_{m\mathbb{Z}}(n)=\sum_{d\mid m}c_q(n),\end{eqnarray}\]
and this is true, because
\[\begin{eqnarray}m1_{m\mathbb{Z}}(n)&=&\sum_{1\leq a\leq m}e\left(\frac{an}{m}\right)\\&=&\sum_{d\mid m}\sum_{1\leq a\leq m\atop(a,m)=d}e\left(\frac{an}{m}\right)\\&=&\sum_{d\mid m}\sum_{1\leq a\leq \frac{m}{d}\atop(a,\frac{m}{d})=1}e\left(\frac{adn}{m}\right)\\&=&\sum_{d\mid m}c_{\frac{m}{d}}(n),\end{eqnarray}\]
so this part is proved.
$(iii)$ This part follows directly from $(ii)$. ■
Theorem 4 (Arithmetic large sieve). Let $A\subset [1,x]$ be a set of integers and $P$ a set of primes such that $A$ avoids $\omega(p)$ residue classes $\pmod p$ for each $p\in P$. Then for any positive integer $z$ we have
\[\begin{eqnarray}S(A,P,z)\leq \frac{z^2+4\pi x}{L(z)},\end{eqnarray}\]
where
\[\begin{eqnarray}L(z):=\sum_{q\leq z}\mu^2(q)\prod_{p\mid q}\frac{\omega(p)}{p-\omega(p)}.\end{eqnarray}\]
Proof. Let $w_{i,q}$ for $i=1,...,\omega(p)$ be the residue classes that $A$ avoids modulo $p$. By the Chinese remainder theorem, $A$ avoids $\omega(q)$ residue classes for all $p\mid P(z):=\prod_{p\leq z\atop p \in P}p$, if $\omega$ is extended multiplicatively to these numbers. Call the residue class $\pmod q$ arising from $w_{i_1,p_1}\pmod{p_1},...,w_{i_r,p_r}\pmod{p_r}$ by $w_{I,q}$, where $I=(i_1,...,i_r)$ and $p_1,...,p_r$ are the prime divisors of $q$. Lemma 3 tells us that the condition $(n-w_{I,q},P(z))=1$ for $n\in S(A,P,z)$ can be expressed as
\[\begin{eqnarray}\mu(q)=c_q(n-w_{I,q}).\end{eqnarray}\]
We sum this equality over all $I$ and $n\in S(A,P,z)$; we find
\[\begin{eqnarray}\mu(q)\omega(q)S(A,P,z)=\sum_{1\leq a\leq q\atop (a,q)=1}\sum_{I}e\left(\frac{-aw_{I,q}}{q}\right)\sum_{n\in S(A,P,Z)}e\left(\frac{an}{q}\right).\end{eqnarray}\]
We square both sides and apply Cauchy-Schwarz to deduce
\[\begin{eqnarray}|\mu(q)\omega(q)S(A,P,z)|^2\leq \sum_{1\leq a\leq q\atop (a,q)=1}\left|\sum_{I}e\left(\frac{-aw_{I,q}}{q}\right)\right|^2\sum_{1\leq a\leq q\atop (a,q)=1}\left|\sum_{n\in S(A,P,Z)}e\left(\frac{an}{q}\right)\right|^2.\end{eqnarray}\quad (1)\]
The first sum on the right can be computed as follows.
\[\begin{eqnarray}\sum_{1\leq a\leq q\atop (a,q)=1}\left|\sum_{I}e\left(\frac{-aw_{I,q}}{q}\right)\right|^2&=&\sum_{1\leq a\leq q\atop (a,q)=1}\sum_{I,I'}e\left(\frac{a(w_{I',q}-w_{I,q})}{q}\right)\\&=&\sum_{I,I'}\sum_{d\mid (q,w_{I',q}-w_{I,q})}\mu\left(\frac{q}{d}\right)\\&=&\sum_{d\mid q}\sum_{I,I'\atop w_{I,q}\equiv w_{I,q}\hspace{-0.15cm}\pmod d}\mu\left(\frac{q}{d}\right)d\\&=&\sum_{d\mid q}\mu\left(\frac{q}{d}\right)d\omega(q)\omega\left(\frac{q}{d}\right)\\&=&q\omega(q)\sum_{e\mid q}\frac{\mu(e)\omega(e)}{e}\\&=&q\omega(q)\prod_{p\mid q}\left(1-\frac{\omega(p)}{p}\right)\\&=&\omega(q)\prod_{p\mid q}(p-\omega(p)).\end{eqnarray}\]
Now $(1)$ can be rewritten as
\[\begin{eqnarray}\mu(q)^2\prod_{p\mid q}\frac{\omega(p)}{p-\omega(p)}S(A,P,z)^2\leq\sum_{1\leq a\leq q\atop (a,q)=1}\left|\sum_{n\in S(A,P,Z)}e\left(\frac{an}{q}\right)\right|^2.\end{eqnarray}\]
When this is summed over $q\leq z$, the right-hand side is precisely of the form required by the large sieve inequality; take $a_n=1$ if $n\in S(A,P;z)$ and $a_n=0$ otherwise. Hence the large sieve inequality for exponential sums gives
\[\begin{eqnarray}\sum_{q\leq z}\mu(q)^2\prod_{p\mid q}\frac{\omega(p)}{p-\omega(p)}S(A,P,z)^2\leq (z^2+4\pi x)S(A,P,z),\end{eqnarray}\]
so the proof is complete. ■
As already mentioned, the large sieve is most useful when the function $\omega(p)$ defined above is large, at least on average; this is contrary to the ''small sieves'' such as the Brun sieve and Selberg sieve. One sample of the power of the large sieve is its application to ''square-mimicking'' sets. We say that a set $A\subset [1,x]$ of integers is square-mimicking if the image $A\mod p$ has at most $\frac{p+1}{2}$ elements for each odd prime $p$. The name obviously comes from the fact that the set of perfect squares up to $x$ has this property. There are many other such sets as well, and we are interested in the counting function $A(x)$ of elements of $A$ up to $x$. The large sieve gives us, with the choice $\omega(p)=\frac{p-1}{2}$,
\[\begin{eqnarray}A(x)&\leq& \frac{z^2+4\pi x}{\sum_{q\leq z}\mu(q)^2\prod_{p\mid q}\left(1-\frac{2}{p+1}\right)}\end{eqnarray}\]
for any positive integer $z$. We have
\[\begin{eqnarray}\prod_{p\mid q}\left(1-\frac{2}{p+1}\right)&\geq& \prod_{p\mid q}\left(1-\frac{2}{p}\right)\\&\geq& \prod_{p\mid q}\left(1-\frac{1}{p}\right)^2\left(1-\frac{3}{p^2}\right)\\&\geq& c\frac{\varphi(q)^2}{q^2}.\end{eqnarray}\]
Now Cauchy-Schwarz gives
\[\begin{eqnarray}\sum_{q\leq z}\mu(q)^2\prod_{p\mid q}\left(1-\frac{2}{p+1}\right)&\gg& \sum_{q\leq z}\mu(q)^2\frac{\varphi(q)^2}{q^2}\\&\geq& \frac{\left(\sum_{q\leq z}|\mu(q)|\frac{\varphi(q)}{q}\right)^2}{z}.\end{eqnarray}\]
We want to show that the last expression is $\gg z$, and partial summation tells that for this it is sufficient to show
\[\begin{eqnarray}\sum_{q\leq z}\mu^2(q)\varphi(q)\gg z.\quad (2)\end{eqnarray}\]
Using the identity $\varphi=\mu\cdot \text{id}$ and a change of the order of summation, we see that
\[\begin{eqnarray}\sum_{q\leq x}\varphi(q)\sim Cx\end{eqnarray}\]
for some $C>0$. In particular, there are at most $\left(\frac{1}{10}+o(1)\right)x$ integers $q\leq x$ with $\varphi(q)<\frac{q}{100}$. Consequently, there are at least $\left(\frac{6}{\pi^2}-\frac{1}{10}+o(1)\right)x$ square-free integers $q\leq x$ with $\varphi(q)\geq \frac{q}{100}$, and this certainly implies $(2)$.
Hence $A(x)\ll \frac{z^2+x}{z}=z+\frac{x}{z}$, and the choice $z=\sqrt{x}$ gives $A(x)\ll \sqrt{x}$, which is optimal up to a constant factor.
Bilinear large sieve
We present one more formulation of the large sieve, this time with the application to the Bombieri-Vinogradov theorem in mind.
Lemma 5. Let $a_n$ be complex numbers and $x$ and $z$ positive integers. Then it holds that
\[\begin{eqnarray}\sum_{q\leq z}\frac{q}{\varphi(q)}\sum_{\chi\hspace{-0.2cm}\pmod q}^{*}\left|\sum_{n\leq x}a_n\chi(n)\right|^2\leq (z^2+4\pi x)\sum_{n\leq x}|a_n|^2,\quad (3)\end{eqnarray}\]
where * indicates that the summation is over the primitive characters.
Proof. Let $\chi$ be a primitive character $\hspace{-0.2cm}\pmod q$. We apply the discrete Fourier transform formula for $\chi(n)$, which says that
\[\begin{eqnarray}\chi(n)=\frac{1}{\tau(\chi)}\sum_{a=1}^{q}\bar{\chi}(a)e\left(\frac{an}{q}\right),\quad (n,q)=1,\end{eqnarray}\]
with $\tau(\chi)$ a Gauss sum, and $|\tau(\chi)|=\sqrt{q}$ (this formula was proved in this post). It follows that
\[\begin{eqnarray}\sum^*_{\chi\hspace{-0.2cm}\pmod q}\left|\sum_{n\leq x}a_n\chi(n)\right|^2&\leq& \frac{1}{q}\sum^*_{\chi\hspace{-0.2cm}\pmod q}\left|\sum_{a=1}^q \bar{\chi}(a)\sum_{n\leq x}a_ne\left(\frac{an}{q}\right)\right|^2\\&=&\frac{1}{q}\sum_{\chi\hspace{-0.2cm}\pmod q}\bar{\chi}(a)\chi(b)\sum_{a=1}^q a_ne\left(\frac{an}{q}\right)\sum_{b=1}^q \overline{a_ne\left(\frac{bn}{q}\right)}.\end{eqnarray}\]
The sum over characters is $\varphi(q)$ if $a\equiv b \pmod q$ and $0$ otherwise. Hence
\[\begin{eqnarray}\frac{q}{\varphi(q)}\sum^*_{\chi\hspace{-0.2cm}\pmod q}\left|\sum_{n\leq x}a_n\chi(n)\right|^2&\leq& \sum_{a=1\atop (a,q)=1}^{q}\left|\sum_{n\leq x}a_ne\left(\frac{an}{q}\right)\right|^2,\end{eqnarray}\]
and this is at most the right-hand side of $(3)$, due to the large sieve inequality. ■
We can immediately deduce a bilinear large sieve inequality, which will be an intermediate step in proving another such inequality. The other bilinear inequality is used to prove the Bombieri-Vinogradov theorem.
Corollary. For any sequences $a_n,b_n$ of complex numbers and positive integers $x,y$ and $z$, we have the inequality
\[\begin{eqnarray}&&\sum_{q\leq z}\frac{q}{\varphi(q)}\sum_{\chi\hspace{-0.2cm}\pmod q}^{*}\left|\sum_{n\leq x}\sum_{m\leq y}a_nb_m\chi(mn)\right|\\&\leq& (z^2+4\pi x)^{\frac{1}{2}}(z^2+4\pi y)^{\frac{1}{2}}\left(\sum_{n\leq x}|a_n|^2\right)^{\frac{1}{2}}\left(\sum_{m\leq y}|b_m|^2\right)^{\frac{1}{2}}.\end{eqnarray}\]
Proof. The statement follows from Lemma 5 and the Cauchy-Schwarz inequality. ■
Lemma 6 (Bilinear large sieve inequality). With the assumptions of the previous corollary,
\[\begin{eqnarray}&&\sum_{q\leq z}\frac{q}{\varphi(q)}\sum_{\chi\hspace{-0.2cm}\pmod q}^{*}\left|\sum_{\substack{n\leq x\\m\leq y\\nm\leq u}}a_nb_m\chi(mn)\right|\\&\ll& (z^2+4\pi x)^{\frac{1}{2}}(z^2+4\pi y)^{\frac{1}{2}}\left(\sum_{n\leq x}|a_n|^2\right)^{\frac{1}{2}}\left(\sum_{m\leq y}|b_m|^2\right)^{\frac{1}{2}}\log(2xy)\end{eqnarray}\]
for any real number $u$, with an absolute implied constant.
Proof. We have the formula
\[\begin{eqnarray}\frac{1}{\pi}\int_{-T}^{T}e^{i\alpha t}\frac{\sin(\beta t)}{t}dt=\begin{cases}1+O\left(\frac{1}{T(\beta-|\alpha|)}\right),\quad|\alpha|<\beta\\ \frac{1}{2},\quad |\alpha|=\beta\\O\left(\frac{1}{T(|\alpha-\beta|)}\right)\end{cases}\quad (4)\end{eqnarray}\]
that allows us to express the condition $mn\leq u$ in analytic terms. One can prove this formula for example by noticing that the Fourier transform of $1_{[-1,1]}(x)$ is $\frac{\sin(2\pi \xi)}{\pi \xi}$ and using Fourier inversion for $L^2$ functions, and lastly using partial integration to see that the integral in $(4)$ differs from the integral over $\mathbb{R}$ by $\ll \frac{1}{||\alpha|-\beta|T}$. Setting $\alpha=\log(mn),\beta=\log u$, and taking $u$ to be a half-integer, we have
\[\begin{eqnarray}\sum_{\substack{n\leq x\\m\leq y\\nm\leq u}}a_nb_m\chi(mn)\ll \left|\int_{-T}^{T}A(t,\chi)B(t,\chi)\frac{\sin(t\log u)}{t}dt\right|+\frac{1}{T}\sum_{n\leq x\atop m\leq y}\frac{|a_nb_m|}{|\log \frac{nm}{u}|},\end{eqnarray}\]
where
\[\begin{eqnarray}A(t,\chi)=\sum_{n\leq x}\frac{a_n\chi(n)}{n^{it}},\quad B(t,\chi)=\sum_{m\leq y}\frac{b_m\chi(m)}{m^{it}}.\end{eqnarray}\]
Due to $|\log\frac{mn}{u}|\gg \log \frac{u+\frac{1}{2}}{u}\gg \frac{1}{u}\gg \frac{1}{xy}$ and $\frac{\sin ax}{x}\ll \min\{1,a\}$, we deduce
\[\begin{eqnarray}&&\sum_{\substack{n\leq x\\m\leq y\\nm\leq u}}a_nb_m\chi(mn)\\&\ll&\left|\int_{-T}^{T}A(t,\chi)B(t,\chi)\min\left\{\frac{1}{t},\log(2xy)\right\}dt\right|+\frac{xy}{T}\sum_{n\leq x \atop m\leq y}|a_nb_m|\\&\ll& \left|\int_{-T}^{T}A(t,\chi)B(t,\chi)\min\left\{\frac{1}{t},\log(2xy)\right\}dt\right|+\frac{(xy)^{\frac{3}{2}}}{T}\left(\sum_{n\leq x}|a_n|^2\right)^{\frac{1}{2}}\left(\sum_{m\leq y}|b_m|^2\right)^{\frac{1}{2}}\end{eqnarray}\]
by the Cauchy-Schwarz inequality. We sum over $q$ and $\chi$ to arrive at
\[\begin{eqnarray}&&\sum_{q\leq z}\frac{q}{\varphi(q)}\sum_{\chi\hspace{-0.2cm}\pmod q}^{*}\left|\sum_{\substack{n\leq x\\m\leq y\\nm\leq u}}a_nb_m\chi(mn)\right|\\&\ll& \sum_{q\leq z}\frac{q}{\varphi(q)}\sum_{\chi\hspace{-0.2cm}\pmod q}^{*}\left|\int_{-T}^{T}A(t,\chi)B(t,\chi)\min\left\{\frac{1}{t},\log(2xy)\right\}dt\right|\\&&+\sum_{q\leq z}\frac{q}{\varphi(q)}\sum_{\chi\hspace{-0.2cm}\pmod q}^{*}\frac{(xy)^{\frac{3}{2}}}{T}\left(\sum_{n\leq x}|a_n|^2\right)^{\frac{1}{2}}\left(\sum_{m\leq y}|b_m|^2\right)^{\frac{1}{2}}.\quad (5)\end{eqnarray}\]
The first sum is at most
\[\begin{eqnarray}\int_{-T}^{T}\sum_{q\leq z}\frac{q}{\varphi(q)}\sum_{\chi\hspace{-0.2cm}\pmod q}^{*}\left|\sum_{n\leq x\atop m\leq y}\frac{a_nb_m\chi(mn)}{(mn)^{it}}\right|\cdot \min\left\{\frac{1}{t},\log(2xy)\right\}dt,\end{eqnarray}\]
and an application of the inequality of the previous corollary shows that this sum is further bounded by
\[\begin{eqnarray}&\ll& (z^2+4\pi x)^{\frac{1}{2}}(z^2+4\pi y)^{\frac{1}{2}}\left(\sum_{n\leq x}|a_n|^2\right)^{\frac{1}{2}}\left(\sum_{m\leq y}|b_m|^2\right)^{\frac{1}{2}}\cdot\int_{-T}^{T}\min\left\{\frac{1}{t},\log(2xy)\right\}dt\\&\ll& (z^2+4\pi x)^{\frac{1}{2}}(z^2+4\pi y)^{\frac{1}{2}}\left(\sum_{n\leq x}|a_n|^2\right)^{\frac{1}{2}}\left(\sum_{m\leq y}|b_m|^2\right)^{\frac{1}{2}}\log(2xy)\end{eqnarray}\]
if $T=(xy)^c$ for a fixed $c$. The second sum in $(5)$ is in turn
\[\begin{eqnarray}&&\frac{(xy)^{\frac{3}{2}}}{T}\sum_{q\leq z}\frac{q}{\varphi(q)}\cdot \varphi(q)\cdot \left(\sum_{n\leq x}|a_n|^2\right)^{\frac{1}{2}}\left(\sum_{m\leq y}|b_m|^2\right)^{\frac{1}{2}}\\&\ll& \frac{(xy)^{\frac{3}{2}}z^2}{T} \left(\sum_{n\leq x}|a_n|^2\right)^{\frac{1}{2}}\left(\sum_{m\leq y}|b_m|^2\right)^{\frac{1}{2}},\end{eqnarray}\]
and now we choose $T=(xy)^{\frac{3}{2}}$, obtaining the statement. ■
The Bombieri-Vinogradov theorem
We preent next the celebrated Bombieri-Vinogradov theorem, proved independently by Bombieri and Vinogradov in 1965. The proof that we follow is due to Vaughan.
Theorem 7 (Bombieri-Vinogradov theorem). Let $A>0$ be given. Then there exists $B>0$ such that
\[\begin{eqnarray}\sum_{q\leq \frac{\sqrt{x}}{\log^B x}}\max_{y\leq x}\,\max_{a\in \mathbb{Z}_{q}^{\times}}\left|\psi(x;q,a)-\frac{x}{\varphi(q)}\right|\ll \frac{x}{\log^A x},\end{eqnarray}\]
and one can actually take $B=A+6.$
The significance of the bound is that it gives the same result that would follow from the generalized Riemann hypothesis; namely, one would then have $|\psi(x;a,q)-\frac{x}{\varphi(q)}|\ll \sqrt{x}\log^2 x$ for all $x$ and all $a$ coprime to $q$, which would be Theorem 7 with $B=A+2$. In other words, Theorem 7 is vaguely speaking the GRH on average. Another reason for the importance of the theorem is that sums like this arise often in sieve theory, when one is sifting in the set of prime numbers, and there are also other contexts where these sums arise naturally. Despite the strength of the Bombieri-Vinogradov theorem, much more ought to be true. The Elliot-Halberstam conjecture asserts that one can replace $\frac{\sqrt{x}}{\log^B x}$ by $x^{1-\varepsilon}$ for any $\varepsilon>0$. So far this has not been done for any $\varepsilon<\frac{1}{2}$, but Zhang proved in 2013 a certain smooth modulus version of the case $\varepsilon=\frac{1}{584}$, and passing the square root barrier enabled showing bounded gaps in primes. Before we prove the Bombieri-Vinogradov theorem, we present a lemma, which is the crucial part in proving the theorem.
Lemma 8. For any positive integers $x$ and $z$, we have
\[\begin{eqnarray}\sum_{q\leq z}\frac{q}{\varphi(q)}\sum_{\chi\hspace{-2mm}\pmod q\atop \chi\neq \chi_0}^{*}\max_{y\leq x}\left|\sum_{n\leq y}\chi(n)\Lambda(n)\right|\ll (z^2x^{\frac{1}{2}}+x+zx^{\frac{5}{6}})(\log x)^{5}.\end{eqnarray}\]
Proof. We use Vaughan's identity, derived in this post, to write
\[\begin{eqnarray}\Lambda(n)=\Lambda(n)1_{[1,U]}(n)+\sum_{ab=n\atop b\leq V}(\log a)\mu(b)-\sum_{\substack{abc=n\\b\leq V\\c\leq U}} \mu(b)\Lambda(c)-\sum_{\substack{ab=n\\a>U\\b>V}}\Lambda(a)\sum_{cd=b\atop d\leq V}\mu(d),\end{eqnarray}\]
where $U$ and $V$ are to be chosen later. The first term in Vaughan's identity contributes
\[\begin{eqnarray}&&\sum_{q\leq z}\frac{q}{\varphi(q)}\sum_{\chi\hspace{-2mm}\pmod q\atop \chi\neq \chi_0}^{*}\max_{y\leq x}\left|\sum_{n\leq \min\{y,U\}}\chi(n)\Lambda(n)\right|\\&\leq& \sum_{q\leq z}\frac{q}{\varphi(q)}\sum_{\chi\hspace{-2mm}\pmod q\atop \chi\neq \chi_0}^{*}\sum_{n\leq U}\Lambda(n)\\&\ll& U\sum_{q\leq z}q\ll z^2 U.\end{eqnarray}\]
For the second term, we compute as follows.
\[\begin{eqnarray}\sum_{n\leq y}\chi(n)\sum_{ab=n\atop b\leq V}(\log a)\mu(b)&=&\sum_{n\leq y}\chi(n)\sum_{ab=n\atop b\leq V}\mu(b)\int_{1}^{y}\frac{1_{[1,a](t)}}{t}dt\\&=&\int_{1}^{y}\sum_{n\leq y}\chi(n)\sum_{ab=n\atop b\leq V}\mu(b)1_{[1,a]}(t)\frac{dt}{t}\\&=&\int_{1}^{y}\sum_{ b\leq V}\mu(b)\sum_{a\leq\frac{y}{b}}1_{[1,a]}(t)\chi(a)\frac{dt}{t}\\&=&\int_{1}^{y}\sum_{ b\leq V}\mu(b)\sum_{t\leq a\leq\frac{y}{b}}\chi(a)\frac{dt}{t}\\&\ll& \int_{1}^{y}\sum_{b\leq V}\sqrt{q}\log q\frac{dt}{t}\\&\ll& V\sqrt{q}(\log q)(\log y)\end{eqnarray}\]
by the Pólya-Vinogradov inequality. This results in
\[\begin{eqnarray}&&\sum_{q\leq z}\frac{q}{\varphi(q)}\sum_{\chi\hspace{-2mm}\pmod q\atop \chi\neq \chi_0}^{*}\max_{y\leq x}\left|\sum_{n\leq \min\{y,U\}}\chi(n)\sum_{ab=n}(\log a)\mu(b)\right|\\&\ll& \sum_{q\leq z}\frac{q}{\varphi(q)}\sum_{\chi\hspace{-2mm}\pmod q\atop \chi\neq \chi_0}^{*}V\sqrt{q}(\log q)(\log x)\\&\ll& \sum_{q\leq z}Vq^{\frac{3}{2}}(\log q)(\log x)\\&\ll& z^{\frac{5}{2}}V(\log x)(\log z).\end{eqnarray}\]
In the third term of Vaughan's identity, we consider separately the contribution of
\[\begin{eqnarray}S_1:=\sum_{\substack{abc=n\\ b\leq V,c\leq U\\bc\leq U}}\mu(b)\Lambda(c)\quad \text{and}\quad S_2:=\sum_{\substack{abc=n\\ b\leq V,c\leq U\\U<bc\leq UV}}\mu(b)\Lambda(c).\end{eqnarray}\]
For the first sum, we proceed using Pólya-Vinogradov:
\[\begin{eqnarray}\sum_{q\leq z}\frac{q}{\varphi(q)}\sum_{\chi\hspace{-2mm}\pmod q\atop \chi\neq \chi_0}^{*}\max_{y\leq x}\left|\sum_{bc\leq U}\mu(b)\Lambda(c)\sum_{a\leq \frac{y}{bc}}\chi(abc)\right|&\ll& U(\log U)\sum_{q\leq z}\frac{q}{\varphi(q)}\sum_{\chi\hspace{-2mm}\pmod q\atop \chi\neq \chi_0}^{*}\sqrt{q}\log q\\&\ll& z^{\frac{5}{2}}U(\log U)(\log z).\end{eqnarray}\]
For the other sum, we are going to use the bilinear large sieve inequality, but only after the sum in the definition of $S_2$ has been partitioned into the ranges $(2^k,2^{k+1}]$ with $\log_2 U<k\leq \log_2(UV)$. This is because we obtain a much smaller result in the large sieve upper bound when doing this. Now setting $d=bc$, we have
\[\begin{eqnarray}&&\sum_{q\leq z}\frac{q}{\varphi(q)}\sum_{\chi\hspace{-2mm}\pmod q\atop \chi\neq \chi_0}^{*}\max_{y\leq x}\left|\sum_{ad\leq y\atop 2^k<d\leq 2^{k+1}}\chi(ad)\left(\sum_{\substack{b\leq V\\c\leq U\\bc=d}}\mu(b)\Lambda(c)\right)\right|\\&\ll& (z^2+\frac{x}{2^k})^{\frac{1}{2}}(z^2+2^k)^{\frac{1}{2}}\left(\frac{x}{2^k}\right)^{\frac{1}{2}}\left(2^k\log x\right)^{\frac{1}{2}}(\log x)\end{eqnarray}\]
where we took $\displaystyle a_n=1,b_n=\sum_{b\leq V,c\leq U\atop bc\leq n}\mu(b)\Lambda(c)$ in the bilinear large sieve inequality and applied the fact that $b_n\ll \log n$. In conclusion, the contribution of $S_2$ is at most a constant times
\[\begin{eqnarray}&&\max_{r\in [U,UV]}\left(z^2+\frac{x}{r}\right)^{\frac{1}{2}}(z^2+r)^{\frac{1}{2}}x^{\frac{1}{2}}(\log x)^{\frac{5}{2}}\\&\ll& \left(z+z\frac{x^{\frac{1}{2}}}{U^{\frac{1}{2}}}+z(UV)^{\frac{1}{2}}\log(UV)+x^{\frac{1}{2}}\right)x^{\frac{1}{2}}(\log x)^{\frac{5}{2}}.\end{eqnarray}\]
Lastly, we turn to estimating the contribution of the fourth term of Vaughan's identity. This resembles the previous case, because the fourth term contributes
\[\begin{eqnarray}\sum_{q\leq z}\frac{q}{\varphi(q)}\sum_{\chi\hspace{-2mm}\pmod q\atop \chi\neq \chi_0}^{*}\max_{y\leq x}\left|\sum_{\substack{ab\leq y\\a>U\\b>V}}\Lambda(a)\left(\sum_{cd=b\atop d\leq V}\mu(d)\right)\chi(ab)\right|.\end{eqnarray}\]
Again, we apply the large sieve inequality separately for the ranges $a\in (2^k,2^{k+1}],\log_2 U<k\leq \log_2(UV)$. For a given $k$, the resulting upper bound is, up to a constant factor,
\[\begin{eqnarray}&&(z^2+\frac{x}{2^k})^{\frac{1}{2}}(z^2+2^k)^{\frac{1}{2}}\max_{y\leq x}\left(\sum_{2^k<a\leq 2^{k+1}}\Lambda(a)^2\right)^{\frac{1}{2}}\left(\sum_{\frac{y}{2^{k+1}}\leq b\leq \frac{y}{2^k}}\tau(b)^2\right)^{\frac{1}{2}}(\log x)\\&\ll& (z^2+\frac{x}{2^k})^{\frac{1}{2}}(z^2+2^k)^{\frac{1}{2}}\left(2^k\log x\right)^{\frac{1}{2}}\left(\frac{x}{2^k}\log^3 x\right)^{\frac{1}{2}}(\log x).\end{eqnarray}\]
The estimate for the sum of $\tau(b)^2$ was established in this post (and is straightforward using the classical bound for sum of $\tau(b)$). Summing over $k$ gives that the fourth term contributes at most
\[\begin{eqnarray}\left(z+z\frac{x^{\frac{1}{2}}}{U^{\frac{1}{2}}}+z(UV)^{\frac{1}{2}}\log(UV)+x^{\frac{1}{2}}\right)x^{\frac{1}{2}}\log^4 x.\end{eqnarray}\]
What remains now is optimizing $U$ and $V$. Thus all the terms in Vaughan's identity are bounded by
\[\begin{eqnarray}z^{\frac{5}{2}}U\log^2 x+z^{\frac{5}{2}}V\log^2 x+\left(zx^{\frac{1}{2}}\left(\frac{x^{\frac{1}{2}}}{U^{\frac{1}{2}}}+(UV)^{\frac{1}{2}}\right)+x\right)\log^5 x\quad (6)\end{eqnarray}\]
We optimize $U$ and $V$ in two different ways. To make the first two terms of the same size, we put $U=V.$ When we also equalize the terms $\frac{x^{\frac{1}{2}}}{U^{\frac{1}{2}}}$ and $(UV)^{\frac{1}{2}}$, we have $U=x^{\frac{1}{3}}$. This leads us to the bound
\[\begin{eqnarray}\left(z^{\frac{5}{2}}x^{\frac{1}{3}}+x+zx^{\frac{5}{6}}\right)\log ^5 x.\end{eqnarray}\]
For $z\leq x^{\frac{1}{3}}$, the inequality $z^{\frac{5}{2}}x^{\frac{1}{3}}\leq z^2x^{\frac{1}{2}}$ proves the claim. For $z>x^{\frac{1}{3}}$ we use a different optimization in $(6)$. We set the terms $z^{\frac{5}{2}}U$ and $\frac{zx}{U^{\frac{1}{2}}}$ equal, so that $U=\frac{x^{\frac{2}{3}}}{z}$. Then we set $z^{\frac{5}{2}}V$ and $zx^{\frac{1}{2}}(UV)^{\frac{1}{2}}$ equal with $V=\frac{x^{\frac{5}{3}}}{z^4}$, obtaining the bound
\[\begin{eqnarray}\left(z^{\frac{3}{2}}x^{\frac{2}{3}}+x+x^{\frac{5}{3}}z^{-\frac{5}{2}}\right)\log ^5 x.\end{eqnarray}\]
Taking into accont that now $z^{\frac{3}{2}}x^{\frac{2}{3}}\leq z^2x^{\frac{1}{2}}$ and $x^{\frac{5}{3}}z^{-\frac{5}{2}}\leq x^{\frac{5}{6}}z$, the proof is finished.■
Proof of Theorem 7. Let $A$ be given, and define
\[\begin{eqnarray}z=\frac{x^{\frac{1}{2}}}{\log^{A+6}x}.\end{eqnarray}\]
We have
\[\begin{eqnarray}\psi(x;q,a)=\frac{1}{\varphi(q)}\sum_{\chi\hspace{-2mm}\pmod q}\bar{\chi}(a)\psi(x;\chi)\ll \frac{1}{\varphi(q)}\left(\psi(x)+\sum_{\chi\hspace{-2mm}\pmod q\atop \chi\neq \chi_0}|\psi(x;\chi)|.\right)\end{eqnarray}\]
Now the left-hand side of the Bombieri-Vinogradov inequality is bounded by
\[\begin{eqnarray}\sum_{q\leq z}\frac{1}{\varphi(q)}\sum_{\chi \hspace{-2mm}\pmod q\atop \chi\neq \chi_0}|\psi(x;\chi)|+\sum_{q\leq z}\frac{|\psi(x)-x|}{\varphi(q)}.\quad (7)\end{eqnarray}\]
The prime number theorem with the classical error term $x\exp(-c\sqrt{\log x})$ tells that the second sum in $(7)$ is bounded by
\[\begin{eqnarray}x\exp(-c'\sqrt{\log x})\end{eqnarray}\]
for some $c>0$.
Notice that if $\chi$ is not a primitive character $\pmod q$ and is induced by a character $\chi'\pmod k$, then trivially $|\psi(x;\chi)-\psi(x;\chi_1)|\ll (\log x)\sum_{p\mid q}\log p\leq (\log x)\log q$. Making use of this, the first sum in $(7)$ can be estimated to be at most
\[\begin{eqnarray}&&\sum_{q\leq z}\frac{1}{\varphi(q)}\sum_{k\mid q}\sum^{*}_{\chi\hspace{-2mm}\pmod k\atop \chi\neq \chi_0}|\psi(x;\chi)|+\sum_{q\leq z}\frac{1}{\varphi(q)}(\log x)\log q\\&\ll& \sum_{q\leq z}\frac{1}{\varphi(q)}\sum_{k\mid q}\sum^{*}_{\chi\hspace{-2mm}\pmod k\atop \chi\neq \chi_0}|\psi(x;\chi)|+\log^4 x\\&=&\sum_{k\leq z}\sum_{r\leq\frac{z}{k}}\frac{1}{\varphi(kr)}\sum^{*}_{\chi\hspace{-2mm}\pmod k\atop \chi\neq \chi_0}|\psi(x;\chi)|+\log^4 x\\&\ll&\sum_{r\leq z}\sum_{k\leq \frac{z}{r}}\frac{1}{\varphi(k)\varphi(r)}\sum^{*}_{\chi\hspace{-2mm}\pmod k\atop \chi\neq \chi_0}|\psi(x;\chi)|+\log^4 x,\quad (8)\end{eqnarray}\]
where * again denotes a sum over primitive characters and we used $\varphi(q)\gg \frac{q}{\log q}$ as well as $\varphi(kr)\geq \varphi(k)\varphi(r)$. For $r\leq \frac{z}{\log^{C}z}$ we use Lemma 8, while for larger values of $r$ we have the Siegel-Walfisz theorem
\[\begin{eqnarray}|\psi(x;\chi)|\ll \frac{x}{\varphi(k)}\exp(-c_0\sqrt{\log x})\end{eqnarray}\]
available (the implied constant is absolute). Hence $(8)$ is bounded by
\[\begin{eqnarray}&&\sum_{r\leq \frac{z}{\log^C z}}\frac{1}{\varphi(r)}\sum_{k\leq \frac{z}{r}}\frac{1}{k}\cdot\frac{k}{\varphi(k)}\sum^{*}_{\chi\hspace{-2mm}\pmod k\atop \chi\neq \chi_0}|\psi(x;\chi)|\\&&+\sum_{\frac{z}{\log^C z}\leq r\leq z}\,\,\,\sum_{k\leq \frac{z}{r}}\frac{1}{\varphi(k)\varphi(r)}x\exp(-c_0\sqrt{\log x})+\log^4 x\\&\ll&_C \sum_{r\leq \frac{z}{\log^C z}}\frac{1}{\varphi(r)}\sum_{k\leq \frac{z}{r}}\frac{1}{k^2}\left(k^{2}x^{\frac{1}{2}}+x+k^{\frac{3}{2}}x^{\frac{2}{3}}\right)\log^5 x+x\exp(-c_1\sqrt{\log x})\\&\ll&_C(zx^{\frac{1}{2}}+x\log^{1-C}z+z^{\frac{1}{2}}x^{\frac{2}{3}}+x\exp(-c_1\sqrt{\log x}))\log^5 x\end{eqnarray}\]
using partial summation. Notice that splitting the range of summation was vital for achieving a saving of a power of the logarithm. With the choices $C=A+6,z=\frac{\sqrt{x}}{\log^{A+6}x}$ the bound simplifies to
\[\begin{eqnarray}\ll_A \frac{x}{\log^A x},\end{eqnarray}\]
which is the desired bound. The proof is now complete. ■
Naturally, one may replace the Chebycheb prime counting function in the Bombieri-Vinogradov theorem by $\pi(x;q,a)$, as is done in the following corollary to the theorem.
Corollary. With the assumptions of Theorem 7, we have
\[\begin{eqnarray}\sum_{q\leq \frac{\sqrt{x}}{\log^B x}}\max_{y\leq x}\,\max_{a\in \mathbb{Z}_{q}^{\times}}\left|\pi(x;q,a)-\frac{Li(x)}{\varphi(q)}\right|\ll \frac{x}{\log^A x},\quad (9)\end{eqnarray}\]
when $B>A+6$.
Proof. Partial summation gives
\[\begin{eqnarray}\pi(x;q,a)=\frac{Li(x)}{\varphi(q)}+\int_{2}^{x}\frac{\psi(t;q,a)-\frac{t}{\varphi(q)}}{t\log t}dt.\end{eqnarray}\]
Putting this into the Bombieri-Vinogradov inequality and changing the order of integration and summation gives shows that $(9)$ is bounded by
\[\begin{eqnarray}\frac{x}{\log^{A+\varepsilon} x}\cdot \int_{2}^{x}\frac{1}{t\log t}\ll \frac{x}{\log^A x},\end{eqnarray}\]
for $\varepsilon=B-A-6$, as wanted. ■
The Titchmarsh divisor problem
We have already applied the Bombieri-Vinogradov theorem in this post to bound from above the number of primes of the form $ap+b$ and the number of representations of an even integer as the sum of two primes. Now we give another application, which would be much more difficult to prove without the Bombieri-Vinogradov theorem. We are going to compute the average number of divisors of numbers of the form $p+a$, where $p$ runs through the primes and $a\neq 0$ is fixed. More precisely, we are going to show the following.
Theorem 9 (Titchmarsh divisor problem). Let $a\neq 0$ be an integer. Then there exists an absolute constant $c_0>0$ for which
\[\begin{eqnarray}\sum_{p\leq x}\tau(p+a)=c_0x+O\left(\frac{x\log \log x}{\log x}\right).\end{eqnarray}\]
Proof. We have $\tau(n)=2\sum_{d\mid n\atop d\leq \sqrt{n}}1$ whenever $n$ is not a perfect square. Hence, for any $\varepsilon>0,$
\[\begin{eqnarray}&&\sum_{|a|\leq \leq p\leq x}\tau(p+a)=\sum_{p\leq x}\sum_{d\mid p+a\atop d\leq \sqrt{p+a}}1+O(x^{\frac{1}{2}+\varepsilon})\\&=&\sum_{d\leq \sqrt{x+a}}\sum_{-a\leq p\leq x\atop p\equiv -a\hspace{-0.2 cm}\pmod d}1+O(x^{\frac{1}{2}+\varepsilon})\\&=&\sum_{d\leq \sqrt{x}}\pi(x;d,-a)+O(x^{\frac{1}{2}+\varepsilon}).\quad (10)\end{eqnarray}\]
Take $A=8$, for instance. For $d\leq \frac{\sqrt{x}}{\log^A x}$, the Bombieri-Vinogradov theorem produces a small enough estimate, while for larger $d$ the Brun-Titchmarsh inequality (from this post) turns out to be sufficient. We have
\[\begin{eqnarray}&&\sum_{d\leq \frac{\sqrt{x}}{\log^A x}}\left(\pi(x;d,-a)-\frac{Li(x)}{\varphi(d)}\right)+\sum_{d\leq \frac{\sqrt{x}}{\log^A x}}\frac{Li(x)}{\varphi(d)}\\&=&\frac{x}{\log x}\left(1+O\left(\frac{1}{\log x}\right)\right)\sum_{d\leq \frac{\sqrt{x}}{\log^A x}}\frac{1}{\varphi(d)}+O\left(\frac{x}{\log^{A-6}x}\right)\\&=&\frac{x}{\log x}\cdot c\log x+O\left(\frac{\log \log^A x}{\log \log x}+\frac{x}{\log^{A-6}x}\right),\end{eqnarray}\]
where we used the Bombieri-Vinogradov theorem and
\[\begin{eqnarray}\sum_{d\leq y}\frac{1}{\varphi(d)}=c\log y+O(1)\quad (11)\end{eqnarray}\]
for some constant $c>0.$ That formula follows just by writing
\[\begin{eqnarray}\frac{1}{\varphi(d)}=\frac{1}{d}\prod_{p\mid d}\frac{1}{1-\frac{1}{p}}=\sum_{k\mid d}\frac{\mu^2(k)}{k},\end{eqnarray}\]
and summing first over $k$ and then over $d$ in $(11)$.
For the contribution of the terms $d>\frac{\sqrt{x}}{\log^A x}$ in $(10)$, Brun-Titchmarsh inequality gives the upper estimate
\[\begin{eqnarray}\sum_{\frac{\sqrt{x}}{\log^A x}< d\leq \sqrt{x}}\pi(x;d,-a)\ll \sum_{\frac{\sqrt{x}}{\log^A x}< d\leq \sqrt{x}}\frac{x}{\varphi(d)\log \frac{x}{d}}\ll \frac{x}{\log x}\log \log x.\end{eqnarray}\]
This finishes the proof. ■