## Saturday, January 31, 2015

### The Rosser-Iwaniec sieve

We derive the Rosser-Iwaniec sieve, which gives both upper and lower bounds for the linear sieve problem that are asymptotically optimal. Linearity of the sieve problem means that $v(p)$, the number of residue classes sifted modulo $p$, is on average $1$. The optimal examples are given by the sets of integers with odd or even number of prime factors. This reflects the parity problem of classical sieves, which states that one cannot distinguish between integers with odd and even number of prime factors by using merely type I sums, such as the error sum
$\begin{eqnarray}\sum_{d\leq D}|R_d|; \quad \quad R_d=|A_d|-\frac{v(d)}{d}|A|\end{eqnarray}$
that occurs also in the Rosser-Iwaniec sieve.

Obtaining lower bound sieves is often more challenging than obtaining upper bound sieves. One reason is that all the error terms must be of lower order than the main term, or our lower bound becomes zero. In addition, some sieves, such as the Selberg sieve, are fundamentally based on inequalities that have no clear lower bound counterpart (the Selberg sieve was based on the trivial observation that $1_{\{1\}}(n)\leq \left(\sum_{d\mid n}\lambda_d\right)^2$ for any $\lambda_i$ with $\lambda_1=1$). However, Buchstab's identity (from this post), which is just a number theoretic inclusion-exclusion principle, can be used to overcome this difficulty by making the signs of the terms we can upper bound negative. Buchstab's identity is also the basis of the Rosser-Iwaniec sieve (even for the upper bound sieve). In the course of deriving the Rosser-Iwaniec sieve, we will also prove the fundamental lemma of the sieve, which is rather elementary but gives optimal results when the sifting parameter is small. In the end, we apply the Rosser-Iwaniec sieve to prove that there are infinitely many primes $p$ such that $p+2$ is almost a prime; namely has no more than three prime factors. We follow Harman's book Prime Detecting Sieves.
Buchstab's function

Lower bound sieves generally require some new parameters compared to the upper bound case, and a rather complicated function that describes the ratio between $xW(z)$ (which is the lower bound one would naively expect by assuming independence of divisibility by different primes) and the real lower bound. Using such a function is beneficial even for upper bound sieves if we cannot afford to lose constant factors. We know that the prime number theorem can be written as
$\begin{eqnarray}S(A,\mathbb{P},x^{\frac{1}{2}})=\pi(2x)-\pi(x)+O(1)\sim \frac{1}{2}\cdot \frac{x}{\log x^{\frac{1}{2}}}\quad (1),\end{eqnarray}$
where $A=[x,2x]\cap \mathbb{Z}$. However, we are soon going to prove the fundamental lemma of sieve theory, which tells among other things that
$\begin{eqnarray}S(A,\mathbb{P},x^{\varepsilon})=(1+O(E(\varepsilon)))e^{-\gamma}\frac{x}{\log x^{\varepsilon}}\quad (2),\end{eqnarray}$
where $\gamma$ is Euler's constant and $E$ is some function tending to zero as the argument tends to zero. This should be anticipated based on the Mertens' theorem (see this post)
$\begin{eqnarray}\prod_{p\leq z}\left(1-\frac{1}{p}\right)\sim \frac{e^{-\gamma}}{\log z}.\end{eqnarray}$
We see that if the probabilistic guess $S(A,P,z)\sim xW(z)$ works for small $z$, we have the claim $(2)$.

From $(1)$ and $(2)$, we expect there to be a relation of the form
$\begin{eqnarray}S(A,\mathbb{P},z)\sim \omega(s)\frac{x}{\log z},\end{eqnarray}$
where $s=\frac{\log x}{\log z}$, so that the two mentioned results are bonded together in the sense that $\omega(2)=\frac{1}{2}$ and $\omega(s)\to e^{-\gamma}$ as $s\to \infty$. It turns out that there is such a function, known as Buchstab's function. Before we can derive the Rosser-Iwaniec sieve, we must discover some properties of Buchstab's function and prove the fundamental lemma of the sieve.

We define Buchstab's function $\omega(u)$ by the following recursion:
$\begin{eqnarray}\omega(u)=\begin{cases}\frac{1}{u},\quad 1\leq u\leq 2\\ \frac{1}{u}\left(k\omega(k)+\int_{k}^{u}\omega(v-1)dv\right),k\leq u\leq k+1,\end{cases}\end{eqnarray}$
where $k\geq 2$ is an integer. It turns out that $\omega$ is a bounded continuous function that is the unique continuous solution to a delayed differential equation. This will be shown in the following lemma.

Lemma 1. (i) We have $\frac{1}{2}\leq \omega(u)\leq 1$ for all $u$.
(ii) $\omega$ is continuous.
(iii) The only continuous solution to the delayed differential equation $(uf(u))'=f(u-1)$ with the condition $f(u)=\frac{1}{u},1\leq u\leq 2$, is $f(u)=\omega(u).$
(iv) We have $\omega(u)=\frac{1+\log(u-1)}{u}$ for $2\leq u\leq 3$.

Proof. (i) We use induction. For $1\leq u \leq 2$, the statement is obvious. Assume it holds for $k\leq u\leq k+1$. Then, for $k+1\leq u\leq k+2$,
$\begin{eqnarray}\omega(u)\leq \frac{1}{u}\left((k+1)\cdot 1+(u-k-1)\right)=1\end{eqnarray}$
and
$\begin{eqnarray}\omega(u)\geq \frac{1}{u}\left(\frac{k+1}{2}+\frac{(u-k-1)}{2}\right)=\frac{1}{2},\end{eqnarray}$
so the inequalities hold also for $k+1\leq u\leq k+2$.

(ii) It suffices to show that $u\omega(u)$ is continuous. We again use induction, the cases $1\leq u\leq 2$ being obvious. We assume continuity on $[k,k+1)$. As $\omega$ is a bounded function, we have
$\begin{eqnarray}u \omega(u)=(k+1)\omega(k+1)+\int_{k+1}^{u}\omega(v-1)dv\to (k+1)\omega(k+1)\end{eqnarray}$
as $u\to k+1$ from the right. Therefore, $u\omega(u)$ is right continuous at $k+1$, and similarly it is seen to be left continuous. Now, for small $\varepsilon>0$, we have
$\begin{eqnarray}&&(u+\varepsilon) \omega(u+\varepsilon)=(k+1)\omega(k+1)+\int_{k+1}^{u+\varepsilon}\omega(v-1)dv\\&&\to (k+1)\omega(k+1)+\int_{k+1}^{u}\omega(v-1)dv=u\omega(u)\end{eqnarray}$
as $\varepsilon \to 0$.

(iii) Directly form the definition of $\omega$ and from (ii), we see that $\omega$ is a continuous solution to the equation. Suppose there are two continuous solutions, say $f$ and $g$. Let $h=f-g$. We have $h(u)=0$ for $1\leq u\leq 2$, and from the equation it is clear that $h(u)=0$ for all $u$. This proves uniqueness.

(iv) A direct computation gives
$\begin{eqnarray}\omega(u)=\frac{1}{u}\left(1+\int_{2}^{u}\frac{dv}{v-1}\right)=\frac{1+\log(u-1)}{u}\end{eqnarray}$
for $2\leq u\leq 3.$ ■

So far we have not connected Buchstab's function to sieve theory in any way. However, next we will establish $S(A,\mathbb{P},z)\sim u \omega(s)\frac{x}{\log z}$, considering a short interval ($A=[x-y,x]$) instead of $[x,2x]$ and having a lot of uniformity in dependence on the sifting exponent. The sieve notation used is standard, but may be recalled from this post.

Theorem 2. Define $\eta(x)=\exp(-\sqrt{\log x})$ and $g(x,k)=\exp\left(-\left(\frac{\log x}{k}\right)^{\frac{4}{7}}\right)$. Let $B=[x-y,x]$, where $y\leq x\eta(x)$. For $x>\exp(k+1)^8$ and $y\in [xg(x,k),x\eta(x)]$, we have
$\begin{eqnarray}S(B,\mathbb{P},x^{\frac{1}{u}})=\frac{y}{\log x}u\omega(u)+O\left(y(k+1)\exp\left(-\sqrt{\frac{\log x}{u}}\right)\right)\end{eqnarray}$
for $k\leq u\leq k+1,$ with an absolute implied constant.

Notice that $\frac{u}{\log x}=\frac{1}{\log x^{\frac{1}{u}}}$. Also notice that Theorem 2 implies an asymptotic for the sifting function on a dyadic interval:
$\begin{eqnarray}S([x,2x],\mathbb{P},x^{\frac{1}{u}})&=&y\sum_{k\leq \frac{x}{y}}\frac{\omega(u)}{x+ky}+O\left((k+1)\exp\left(-\sqrt{\frac{\log x}{u}}\right)\right)\\&\sim& u\omega(u)\int_{x}^{2x}\frac{dt}{\log t}\\&\sim& u\omega(u)\frac{x}{\log x}\end{eqnarray}$
for fixed $u$ by comparing the sum to the corresponding integral.

Proof. For $1\leq u\leq 2$, we have
$\begin{eqnarray}S(B,\mathbb{P},x^{\frac{1}{u}})=S(B,\mathbb{P},x^{\frac{1}{2}})&=&\frac{y}{\log x}+O(x\exp(-2(\log x)^{\frac{4}{7}}))\\&=&\frac{y}{\log x}+O(y\exp(-\sqrt{2\log x})\end{eqnarray}$
by the prime number theorem with Chudakov's error term (see this post).

We assume the cases $k\leq u\leq k+1$ and proceed to prove the cases $k+1\leq u\leq k+2$.

The key to the proof is Buchstab's identity, which gives
$\begin{eqnarray}S(B,\mathbb{P},x^{\frac{1}{u}})&=&S(B,\mathbb{P},x^{\frac{1}{k+1}})+\sum_{x^{\frac{1}{u}}<p\leq x^{\frac{1}{k+1}}}S(B_p,\mathbb{P},p)\\&=&S(B,\mathbb{P},x^{\frac{1}{k+1}})+\sum_{x^{\frac{1}{u}}<p\leq x^{\frac{1}{k+1}}}S\left(B_p,\mathbb{P},\left(\frac{x}{p}\right)^{\frac{\log p}{\log \frac{x}{p}}}\right).\end{eqnarray}$
We have $\frac{\log p}{\log \frac{x}{p}}\in \left[\frac{1}{k+1},\frac{1}{k}\right]$, so we may apply the induction hypothesis, provided that the conditions are satisfied. As $\log x>(k+2)^8$, we have $\log \frac{x}{p}\geq \left(1-\frac{1}{k+1}\right)(k+2)^8>(k+1)^8$ and $y\leq x\eta(x)$ implies $\frac{y}{p}\leq \frac{x}{p}\eta\left( \frac{x}{p}\right)$. Furthermore, $y\geq xg(x,k+1)$ implies $\frac{y}{p}\geq \frac{x}{p}g(x,k+1)\geq \frac{x}{p}g\left(\frac{x}{p},k\right)$, because
$\begin{eqnarray}g\left(\frac{x}{p},k\right)=\exp\left(\left(\frac{\log \frac{x}{p}}{k}\right)^{\frac{4}{7}}\right)\leq \exp\left(-\left(\frac{(\log x)(1-\frac{1}{k+1})}{k}\right)^{\frac{4}{7}}\right)=g(x,k+1).\end{eqnarray}$

Now we get
$\begin{eqnarray}S(B,\mathbb{P},x^{\frac{1}{u}})&=&\frac{y}{\log x}(k+1)\omega(k+1)+O\left(y(k+1)\exp\left(-\sqrt{\frac{\log x}{u}}\right)\right)\\&+&\sum_{x^{\frac{1}{u}}<p\leq x^{\frac{1}{k+1}}} \frac{y}{p\log \frac{x}{p}}\left(\frac{\log \frac{x}{p}}{\log p}\omega\left(\frac{\log \frac{x}{p}}{\log p}\right)\right)\\&+&O\left(\sum_{x^{\frac{1}{u}}<p\leq x^{\frac{1}{k+1}}}(k+1)\frac{y}{p}\exp(-\sqrt{\log p})\right)\quad (3).\end{eqnarray}$
Denote the second last sum by $S_1$ and the last one by $S_2$. We find
$\begin{eqnarray}S_2&\leq& y(k+1)\exp\left(-\sqrt{\frac{\log x}{u}}\right)\sum_{x^{\frac{1}{u}}<p<x^{\frac{1}{k+1}}}\frac{1}{p}\\&\leq& y(k+1)\exp\left(-\sqrt{\frac{\log x}{u}}\right) \left(\log\left(\frac{\log x}{k+1}\right)-\log \left(\frac{\log x}{u}\right)+O\left(\frac{1}{\log x}\right)\right)\\&\leq& y(k+1)\exp\left(-\sqrt{\frac{\log x}{u}}\right)\left(\log \frac{k+2}{k+1}+O\left(\frac{k}{\log x}\right)\right)\\&\leq& 2y\exp\left(-\sqrt{\frac{\log x}{u}}\right)\end{eqnarray}$
by Mertens' second theorem and by the assumption $\log x>k^8$. We now compute the main term $S_1$.
$\begin{eqnarray}S_1&=&\sum_{x^{\frac{1}{u}}<p\leq x^{\frac{1}{k+1}}} \frac{y}{p\log p}\omega\left(\frac{\log \frac{x}{p}}{\log p}\right)\\&=&\sum_{x^{\frac{1}{u}}<n\leq x^{\frac{1}{k+1}}} \frac{y}{n \log^2 n}\omega\left(\frac{\log \frac{x}{n}}{\log n}\right)1_{\mathbb{P}}(n)\log n.\end{eqnarray}$
If $\theta(x)=\sum_{p\leq x}\log p$ denotes the Chebychev function, partial summation gives
$\begin{eqnarray}S_1=\theta\left(x^{\frac{1}{k+1}}\right)f\left(x^{\frac{1}{k+1}}\right)-\theta\left(x^{\frac{1}{u}}\right)f\left(x^{\frac{1}{u}}\right)-\int_{x^{\frac{1}{u}}}^{x^{\frac{1}{k+1}}}\theta(t)f'(t)dt,\end{eqnarray}$
where
$\begin{eqnarray}f(t)=\frac{y}{t\log^2 t}\omega\left(\frac{\log x}{\log t}-1\right).\end{eqnarray}$
The prime number theorem with Chudakov's error term gives
$\begin{eqnarray}S_1&=&x^{\frac{1}{k+1}}f\left(x^{\frac{1}{k+1}}\right)-x^{\frac{1}{u}}f\left(x^{\frac{1}{u}}\right)-\int_{x^{\frac{1}{u}}}^{x^{\frac{1}{k+1}}}tf'(t)dt\\&+&O\left((k+1)^2 y\exp\left(-\left(\frac{\log x}{k+1}\right)^{\frac{4}{7}}\right)\right)\\&=&\int_{x^{\frac{1}{u}}}^{x^{\frac{1}{k+1}}}f(t)dt+O\left(y\exp\left(-\sqrt{\frac{\log x}{u}}\right)\right),\end{eqnarray}$
where the fact that $f\left(x^{\frac{1}{k+1}}\right)\leq (k+1)^2 \frac{y}{x^{\frac{1}{k+1}}}$ and partial integration were employed. Now
$\begin{eqnarray}S_1&=&\int_{x^{\frac{1}{u}}}^{x^{\frac{1}{k+1}}}\frac{y}{t\log^2 t}\omega\left(\frac{\log x}{\log t}-1\right)dt+O\left(y\exp\left(-\sqrt{\frac{\log x}{u}}\right)\right)\\&=&-\int_{u}^{k+1}\omega(v-1)dv\cdot \frac{y}{\log x}\end{eqnarray}$
with the substitution $v=\frac{\log x}{\log t},$ $dv=-\frac{\log x}{t\log^2 t}dt$. The main terms in $(3)$ sum up to
$\begin{eqnarray}\frac{y}{\log x}\left((k+1)\omega(k+1)+\int_{k+1}^{u}\omega(v-1)dv\right)=u\omega(u)\frac{y}{\log x}.\end{eqnarray}$
The error terms contribute at most
$\begin{eqnarray}Cy(k+1)\exp\left(-\sqrt{\frac{\log x}{u}}\right)+Cy\exp\left(-\sqrt{\frac{\log x}{u}}\right)=C(k+2)\exp\left(-\sqrt{\frac{\log x}{u}}\right),\end{eqnarray}$
where $C$ is an absolute constant. This completes the proof. ■

Fundamental lemma of the sieve

The fundamental lemma of the sieve (which, by the way, is not always formulated in the same way) enables one to deal with the sifting function very accurately, as long as the sifting parameter ($z$ in the notation $S(A,\mathbb{P},z)$) stays small. This result will be needed to prove the Rosser-Iwaniec sieve, where the sifting parameter can be large.

Lemma 3. Let $Q=\prod_{i=1}^{r} p_i,\,\,\,p_i\leq w$ and $E\geq 1.$ Then
$\begin{eqnarray}\left|\sum_{d\mid Q\atop d\geq E}\mu(d)\right|\leq \sum_{d\mid Q\atop E\leq d<Ew}1.\end{eqnarray}$

Proof. We induct on $r$. If $r=1$,
$\begin{eqnarray}\sum_{d\mid p\atop d\geq E}\mu(d)=\begin{cases}0,\quad E>p\,\,\text{or}\,\,E=1\\-1,\quad 1\leq E\leq p,\end{cases}\end{eqnarray}$
and $\sum_{d\mid p\atop E\leq d<pE}1$ clearly has the same values. In the induction step, let $Q'=Qp$ with $\omega(Q)=r,p\nmid Q$. Now
$\begin{eqnarray}\sum_{d\mid Qp\atop d\geq E}\mu(d)=\sum_{d\mid Q\atop d\geq E}\mu(d)-\sum_{d\mid Q\atop d\geq \frac{E}{p}}\mu(d):=S_1-S_2.\end{eqnarray}$
By the induction assumption,
$\begin{eqnarray}|S_1|\leq \sum_{d\mid Q\atop E\leq d\leq Ew}1,\quad |S_2|\leq \sum_{d\mid Q\atop \frac{E}{p}\leq d<\frac{Ew}{p}}1.\end{eqnarray}$
It holds that
$\begin{eqnarray}\sum_{d\mid Q\atop \frac{E}{p}\leq d<\frac{Ew}{p}}1=\sum_{p\mid d\mid Qp\atop E\leq d<Ew}1\end{eqnarray}$
and
$\begin{eqnarray}\sum_{d\mid Qp\atop E\leq d<Ew}1+\sum_{p\mid d\mid Qp\atop E\leq d<Ew}1=\sum_{d\mid Qp\atop E\leq d<Ew}1,\end{eqnarray}$
so the proof is complete. ■

The previous lemma will be utilized in proving the following.

Lemma 4. Let $a,u>0,w=x^{\frac{1}{u}},D=x^a.$ Assume $\frac{1}{a}<u<(\log x)^{1-\varepsilon}$ and that $v$ is a multiplicative function on the square free integers with the property
$\begin{eqnarray}\left|\sum_{p\leq z}\frac{v(p)\log p}{p}-\kappa \log z\right|\leq C\end{eqnarray}$
for all $z\geq 2.$ Then $\begin{eqnarray}\sum_{d\mid P(w)\atop d>D}\frac{v(d)}{d}\ll_{C} \exp\left(\kappa\log \log w+ua-ua\log \left(\frac{ua}{\kappa}\right)\right).\end{eqnarray}$

We mention that the parameter $\kappa$ is usually called the dimension of the sieve problem, and it is the average value of $v(p)$ in the sense above. It often plays a crucial role in sieve estimates, and the smaller it is, the better bounds we expect. For example, if we are sifting in the set $[1,x]$, we have $\kappa=1$, whereas sifting in $\{a^2+b^2\leq x\}$ means $\kappa=\frac{1}{2}$.

Proof. We use Rankin's trick to estimate, for any $\alpha>0$,
$\begin{eqnarray}\sum_{d\mid P(w)\atop d>D}\frac{v(d)}{d}&\leq& \frac{1}{D^{\alpha}}\sum_{d\mid P(w)}\frac{v(d)}{d^{1-\alpha}}\\&=&\frac{1}{D^{\alpha}}\prod_{p<w}(1+v(p)p^{\alpha-1})\\&=&\frac{1}{D^{\alpha}}\exp\left(\sum_{p<w}\log(1+v(p)p^{\alpha-1})\right)\\&\leq& \frac{1}{D^{\alpha}}\exp\left(\sum_{p<w}\frac{v(p)(p^{\alpha}-1)}{p}+\sum_{p<w}\frac{v(p)}{p}\right).\end{eqnarray}$

By partial summation it is easy to see that the condition for $v$ is equivalent to
$\begin{eqnarray}\left|\sum_{p\leq z}\frac{v(p)}{p}-\kappa\log \log z\right|\leq C'.\end{eqnarray}$
Let $A>0$ be so large that
$\begin{eqnarray}\sum_{A<p\leq z}\frac{v(p)\log p}{p}\leq \kappa \log z,\quad \sum_{A<p\leq z}\frac{v(p)}{p}\leq \kappa \log \log z.\end{eqnarray}$

From the fact that $t\mapsto \frac{t^{\alpha}-1}{\log t}$ is increasing on $(1,\infty)$, we infer
$\begin{eqnarray}\sum_{d\mid P(w)\atop d>D}\frac{v(d)}{d}&\ll_{C}& \frac{1}{D^{\alpha}}\exp\left(\sum_{A<p<w}\frac{v(p)\log p}{p\log w}(w^{\alpha}-1)+\kappa \log \log z\right)\\&\leq& \exp\left(\kappa w^{\alpha}-\alpha a\log x+\kappa \log \log w\right).\end{eqnarray}$

We finally optimize the value of $\alpha$ by minimizing $f(\alpha)=w^{\alpha}-\alpha a \log x$. We have $f'(\alpha)=w^{\alpha}\log w-a\log x=0$ if and only if $w^{\alpha}=au$, or equivalently $\alpha=\frac{\log(au)}{\log w}$. Setting $\alpha$ equal to this value, we obtain the claim. ■

We have now all the tools to prove the fundamental lemma.

Theorem 5 (Fundamental lemma of the Rosser-Iwaniec sieve). Let $w=x^{\frac{1}{u}},D=x^a$ and $\frac{1}{a}<u<(\log x)^{1-\delta}.$ Suppose $|A_d|=|A|\frac{v(d)}{d}+R_d$ with $v$ a multiplicative function on the square free integers satisfying the same condition as in Lemma 4. Then
$\begin{eqnarray}S(A,\mathbb{P},w)=|A|W(w)(1+O(\varepsilon))+O\left(\sum_{d\leq Dw}|R_d|\right),\end{eqnarray}$
where
$\begin{eqnarray}\varepsilon=\exp\left(2\kappa\log \log w+ua-ua\log \frac{ua}{\kappa}\right).\end{eqnarray}$
The implicit constants may depend on $C$, where $C$ is as in Lemma 4.

The fundamental lemma tells in particular that whenever $w$ is sufficiently small, say $w=x^{(\log \log x)^{-1}}$, we have the asymptotic
$\begin{eqnarray}S(A,\mathbb{P},w)=(1+o(1))|A|W(w)+O\left(\sum_{d\leq x^{\delta}}|R_d|\right).\end{eqnarray}$

We remark that in the case $A=[x,2x],$ we see that Buchstab's function satisfies $\omega(u)\to e^{-\gamma}$ as $u\to \infty.$

Proof. We split the result that the Eratosthenes-Legendre formula (from this post) gives for $S(A,\mathbb{P},w)$ as follows:
$\begin{eqnarray}S(A,\mathbb{P},w)=\sum_{d\mid P(w)\atop d\leq D}\mu(d)|A_d|+\sum_{d\mid P(w)\atop d>D}\mu(d)|A_d|:=S_1+S_2.\end{eqnarray}$

By Lemma 4,
$\begin{eqnarray}|S_2|&=&\left|\sum_{n\in A}\sum_{d\mid (P(w),n)\atop d>D}\mu(d)\right|\\&\leq& \sum_{n\in A}\sum_{d\mid (P(w),n)\atop D<d\leq Dw}1\\&=&\sum_{d\mid P(w)\atop D<d\leq Dw}|A_d|\\&=&|A|\sum_{d\mid P(w)\atop D<d\leq Dw}\frac{v(d)}{d}+\sum_{d\mid P(w)\atop D<d\leq Dw}|R_d|\\&\ll& \frac{|A|}{(\log w)^{\kappa}}\exp\left(2\kappa \log \log w+ua-ua\log\frac{ua}{\kappa}\right)+\sum_{d\leq Dw}|R_d|.\end{eqnarray}$

Moreover,
$\begin{eqnarray}S_1&=&\sum_{d\mid P(w)\atop d\leq D}\mu(d)|A_d|\\&=&|A|\sum_{d\mid P(w)\atop d\leq D}\frac{\mu(d)v(d)}{d}+\sum_{d\mid P(w)\atop d\leq D}\mu(d)|R_d|\\&=&|A|\sum_{d\mid P(w)}\frac{\mu(d)v(d)}{d}-|A|\sum_{d\mid P(w)\atop d> D}\frac{\mu(d)v(d)}{d}+\sum_{d\mid P(w)\atop d\leq D}\mu(d)|R_d|.\end{eqnarray}$
The first sum equals $|A|W(w)$, whereas the second sum is
$\begin{eqnarray}\ll |A|\sum_{d\mid P(w)\atop d>D}\frac{v(d)}{d}\ll \frac{|A|}{(\log w)^{\kappa}}\cdot \varepsilon.\end{eqnarray}$
Partial summation shows that $W(w)\asymp (\log w)^{-\kappa}$. Hence the statement follows. ■

Extremal examples of the Rosser-Iwaniec sieve

To formulate the Rosser-Iwaniec sieve, we need to define the two extremal examples. Define $\mathcal{B}^{-}=\{n\leq D: \lambda(n)=1\}, \mathcal{B}^{+}=\{n\leq D:\lambda(n)=-1\}$, where $\lambda(n)$, the Liouville function is $1$ if $n$ has an even number of prime factors, and $-1$ otherwise. The prime number is equivalent to $|\mathcal{B}^{-}|,|\mathcal{B}^{+}|=(1+o(1))\frac{D}{2}$. In fact, we have the following lemma.

Lemma 6. We have $|\mathcal{B}^{-}|-|\mathcal{B}^{+}|=\sum_{n\leq D}\lambda(n)\ll_{c} D\exp(-c\sqrt{\log D})$ for some $c>0$.

Proof. If we show $\sum_{n\leq x}\mu(n)\ll_{c} x\exp(-c\sqrt{\log x})$, we obtain
$\begin{eqnarray}\sum_{n\leq D}\lambda(n)=\sum_{k\leq \sqrt{D}}\sum_{n\leq \frac{D}{k^2}}\mu(n)\ll_{c} \sum_{k\leq \sqrt{D}}\frac{D}{k^2}\exp\left(-c\sqrt{\log \frac{D}{k^2}}\right)\end{eqnarray}$
The terms $k\leq D^{\frac{1}{4}}$ contribute
$\begin{eqnarray}\ll \sum_{k\leq \sqrt{D}}\frac{D}{k^2}\exp\left(-\frac{c}{2}\sqrt{\log D}\right)\ll D(\log D)\exp\left(-\frac{c}{2}\sqrt{\log D}\right).\end{eqnarray}$
On the other hand, the terms $k>D^{\frac{1}{4}}$ contribute
$\begin{eqnarray}\ll \sum_{D^{\frac{1}{4}}<k\leq \sqrt{D}}\frac{D}{k^2}\ll D^{\frac{3}{4}}.\end{eqnarray}$
Hence we see that it is enough to show the inequality of the lemma for the Möbius function. This can be done in the same way as one proves the error term $\ll_{c} x\exp(-c\sqrt{\log x})$ in the prime number theorem using Perron's formula.■

We may now formulate the Rosser-Iwaniec sieve.

Theorem 7 (Rosser-Iwaniec sieve). Let $x\geq D\geq z^2\geq 2$, and let $A\subset [1,x]$ be a set of integers. Suppose $A_d=|A|\frac{v(d)}{d}+R_d$ for square free $d$ with $v$ being multiplicative and satisfying the condition of Lemma 4 (that is, $v(p)$ has an average). Then
$\begin{eqnarray}S(A,\mathbb{P},z)\leq |A|e^{\gamma}W(z)\left(\frac{S(\mathcal{B}^{+},\mathbb{P},z)\log z}{\frac{1}{2}D}+O(\log_3 D)^{-1}\right)+O\left(\sum_{d\leq D\atop \mu(d)\neq 0}|R_d|\right)\end{eqnarray}$
and
$\begin{eqnarray}S(A,\mathbb{P},z)\geq |A|e^{\gamma}W(z)\left(\frac{S(\mathcal{B}^{-},\mathbb{P},z)\log z}{\frac{1}{2}D}+O(\log_3 D)^{-1}\right)+O\left(\sum_{d\leq D\atop \mu(d)\neq 0}|R_d|\right),\end{eqnarray}$
where $\log_3$ is the third iterated logarithm and $W(z)=\prod_{p\leq z}\left(1-\frac{v(p)}{p}\right).$

It is now natural to ask how $S(\mathcal{B}^{+},\mathbb{P},z)$ and $S(\mathcal{B}^{-},\mathbb{P},z)$ grow asymptotically. It turns out that their behaviour is not that different from the behavior of $S([x,2x],\mathbb{P},z)$, which was determined earlier. Indeed, we have the following lemma.

Lemma 8. There exist functions $g$ and $G$ such that
$\begin{eqnarray}S(\mathcal{B}^{-},\mathbb{P},z)\sim g(s)\frac{D}{\log z},\quad S(\mathcal{B}^{+},\mathbb{P},z)\sim G(s)\frac{D}{\log z}\end{eqnarray}$
when $s=\frac{\log D}{\log z}\geq 1$ is fixed.

Proof. For $z\geq D^{\frac{1}{2}},$ the only numbers we are counting are primes and $1$. Primes have an odd number of prime factors, so we may take $G(s)=\frac{1}{s},g(s)=0$ for $1\leq s\leq 2$. Assume that $g(s)$ and $G(s)$ are defined for $1\leq s\leq k$ and satisfy the claim in this range. We show that $g(s)$ is defined also on $[k+1,k+2];$ the case of $G$ is analogous. Buchstab's identity gives
$\begin{eqnarray}S(\mathcal{B}^{-},\mathbb{P},D^{\frac{1}{s}})=S(\mathcal{B}^{-},D^{\frac{1}{k}})+\sum_{D^{\frac{1}{s}}<p\leq D^{\frac{1}{k}}}S\left(\mathcal{B}^{-}_p,\mathbb{P},\left(\frac{D}{p}\right)^{\frac{\log p}{\log \frac{D}{p}}}\right).\end{eqnarray}$
The set $\mathcal{B}^{-}_p$ actually counts integers with an odd number of prime divisors, so the induction hypothesis tells that
$\begin{eqnarray}S(\mathcal{B}^{-},\mathbb{P},D^{\frac{1}{s}})=(1+o(1))\left(kg(k)\frac{D}{\log D}+\sum_{D^{\frac{1}{s}}<p\leq D^{\frac{1}{k}}}\right) \frac{D}{p\log p}G\left(\frac{\log \frac{D}{p}}{\log p}\right).\end{eqnarray}$
The exact same argument as in Theorem 2 transforms this into
$\begin{eqnarray}(1+o(1))\left(\left(kg(k)+\int_{k}^{s}G(v-1)dv\right)\frac{D}{\log D}\right).\end{eqnarray}$
This means that the definition
$\begin{eqnarray}g(s)=kg(k)+\int_{s}^{k}G(v-1)dv\end{eqnarray}$
is admissible. Of course, $G$ is then defined by
$\begin{eqnarray}G(s)=kG(k)+\int_{k}^s g(v-1)dv,\end{eqnarray}$
and the lemma is proved. ■

From the definitions of $g$ and $G$, it is immediate that $0\leq g(s),G(s)\leq 1$, $g(s)+G(s)=\omega(s)$, $G(s)=\frac{1}{s}$ for $1\leq s\leq 3$ and $g(s)=\frac{\log\frac{s-1}{2}}{s}$ for $2\leq s\leq 3.$ We could now rewrite the main terms of the Rosser-Iwaniec sieve as $2e^{\gamma}|A|W(z)G(s)$ and $2e^{\gamma}|A|W(z)g(s)$. The reason for $e^{\gamma}$ appearing is that $W(z)\sim \frac{e^{-\gamma}}{\log z}$ for $v(d)=1$.

Now that we know how the bounds given by the Rosser-Iwaniec sieve grow asymptotically, we turn to the examples that show its optimality. The formulation of the sieve suggests that the examples are related to $\mathcal{B}^{+}$ and $\mathcal{B}^{-}$, and indeed we may take $A_1=\{n\leq E:\lambda(n)=-1\}$ and $A_2=\{n\leq E':\lambda(n)=1\}$ as the extremal examples, with $E=D\exp(\frac{c}{2}\sqrt{\log D})$. We verify that $A_1$ is an extremal upper bound, the lower bound case of $A_2$ being similar. With the choice $D=z^2$, we have then $W(z)\sim \frac{e^{-\gamma}}{\log z}$ and
$\begin{eqnarray}S(A_1,\mathbb{P},z)\leq \frac{E}{2\log z}\left(\frac{S(\mathcal{B}^{+},\mathbb{P},z)\log z}{\frac{1}{2}D}+o(1)\right)+O\left(\sum_{d<D}|R_d|\right).\quad (4)\end{eqnarray}$
Since $|R_d|=\frac{E}{2d}+\frac{1}{2}\lambda(d)\sum_{n\leq \frac{E}{d}}\lambda(n)$, we find
$\begin{eqnarray}\sum_{d< D}|R_d|&\ll& \sum_{d<D}\left|\sum_{n\leq \frac{E}{d}}\lambda(n)\right|\\&\ll& D+\sum_{d<D}\frac{E}{d}\exp\left(-c\sqrt{\frac{E}{d}}\right)\\&\ll& D+E\exp\left(-\frac{c}{2}\sqrt{\log D}\right)=o\left(\frac{E}{\log D}\right)\end{eqnarray}$
by Lemma 6. Now, denoting again $s=\frac{\log D}{\log z}$, $(4)$ becomes
$\begin{eqnarray}S(A_1,\mathbb{P},z)&\leq& \frac{E}{D}S(\mathcal{B}^{+},\mathbb{P},z)+o\left(\frac{E}{\log D}\right)\\&=&\frac{E}{\log D}(sG(s)+o(1))=\frac{E}{\log D}\left(\frac{\log E}{\log z}G\left(\frac{\log E}{\log z}\right)+o(1)\right)\\&=&S(A_1,\mathbb{P},z)(1+o(1)),\end{eqnarray}$
because $sG(s)=1$ for $1\leq s\leq 3$. The case of $A_2$ is otherwise similar, but we take $D=z^{2+\varepsilon}$ and use $sg(s)=\log(\frac{1+\varepsilon}{2})=(1+o(1))\log(\frac{\frac{\log E}{\log z}-1}{2})$ We see that the estimates lost only a factor of $1+o(1)$. It turns out that the Rosser-Iwaniec sieve is optimal also in the case where the sieve dimension is $\frac{1}{2}$, but the examples are more complicated. In other dimensions, no sieve is known to be optimal.

Derivation of the Rosser-Iwaniec sieve

Proof of the Rosser-Iwaniec sieve. We prove the lower bound sieve, and in the end describe the slight modifications one needs for the upper bound sieve. Define
$\begin{eqnarray}w=\exp\left(\frac{\log D}{(\log_2 D)^2}\right),\quad W_0=\exp\left(\frac{\log D}{\log_2 D}\right),\quad E=D\exp\left(-\frac{2\log D}{\log_2 D}\right).\end{eqnarray}$
We are going to apply Buchstab's identity repeatedly. Two applications give
$\begin{eqnarray}S(A,z)=S(A,w)-\sum_{w\leq p<z}S(A_p,w)+\sum_{w\leq p_2<p_1<z}S(A_{p_1p_2},p_2).\end{eqnarray}$

We use Buchstab's identity twice again to those terms with $p_1p_2^3<E.$ We get
$\begin{eqnarray}S(A,z)&=&S(A,w)-\sum_{w\leq p<z}S(A_p,w)+\sum_{w\leq p_2<p_1<z\atop p_1p_2^3\geq E}S(A_{p_1p_2},p_2)+\sum_{w\leq p_2<p_1<z\atop p_1p_2^3<E}S(A_{p_1p_2},w)\\&-&\sum_{w\leq p_3<p_2<p_1<z\atop p_1p_2^3<E}S(A_{p_1p_2p_3},w)+\sum_{w\leq p_4<p_3<p_2<p_1\atop p_1p_2^3<E}S(A_{p_1p_2p_3p_4},p_4).\end{eqnarray}$

We apply Buchstab's identity twice more to the terms with $p_1p_2p_3p_4^3<E$, and continue in this way, finally ending up with
$\begin{eqnarray}S(A,z)=\sum_{\substack{d\leq E\\d=p_1...p_{\ell}\\w\leq p_{\ell}<...<p_1<z\\p_1...p_{2r}^3<E}}\mu(d)S(A_d,w)+\sum_{r}\sum_{\substack{w\leq p_{2r}<...<p_1<z\atop p_1...p_{2r}^3\geq E\\p_1...p_{2j}^3<E,j<r}}S(A_{p_1...p_{2r}},p_{2r}),\end{eqnarray}$
where $r=\lfloor\frac{\ell}{2}\rfloor$. As $w=W_0^{\frac{1}{u}}$ with $u=\log_2 D$, taking $a=1$ in the Fundamental lemma yields
$\begin{eqnarray}S(A_d,w)&=&\frac{H}{d}W(w)\left(1+O(\exp(2\log_2 w+2\log_2 D)-\log_2 D\log_3 D)) \right)\\&+&O\left(\sum_{\substack{f<W_0w\\\mu(f)\neq 0\\p\mid f\Rightarrow p<w}}|R_{df}|\right)\\&=&\frac{H}{d}W(w)\left(1+O(\exp(-\log_2D(\log_3 D-4))) \right)+O\left(\sum_{\substack{f<W_0w\\\mu(f)\neq 0\\p\mid f\Rightarrow p<w}}|R_{df}|\right)\end{eqnarray}$
where $W(\cdot)$ is as before. Therefore,
$\begin{eqnarray}S(A,z)&=&\sum \mu(d)\frac{H}{d}W(w)+O\left(\sum_{m\leq EW_0w\atop \mu(m)\neq 0}|R_m|\right)+O\left(\sum_{d\leq E}\frac{H}{d}W(w)(\log D)^{-\frac{\log_3 D}{2}}\right)\\&+&\sum_{r}\sum_{\substack{w<p_{2r}<...<p_1<z\\p_1...p_{2r}^3\geq E\\p_1...p_{2j}^3<E,j<r}}S(A_{p_1...p_{2r}},p_{2r})\quad \quad \quad\quad (5)\\&:=&S_1+O(\mathcal{E})+O(H(\log D)^{-2})+S_2,\end{eqnarray}$
because the representation of $m$ as $m=df$ with $f$ having all its prime factors below $w$ and $d$ having its prime factors above $w$ is unique. The last sum $S_2$ is something we cannot estimate further with this method, but luckily we can discard it as it is nonnegative. The loss by doing this is not expected to be overwhelming, since we will soon see that in the case $A=\mathcal{B}^{-}$ it is manageable. When we apply $(5)$ with $A=\mathcal{B}^{-}$, we get
$\begin{eqnarray}S(\mathcal{B}^{-},z)&=&\sum\mu(d)\frac{D}{2d}W(w)+O\left(\sum_{d\leq EW_0w}\left||\mathcal{B}_d^{-}|-\frac{D}{2d}\right|\right)+O(D(\log D)^{-2})\\&+&\sum_{r}\sum_{\substack{w<p_{2r}<...<p_1<z\\p_1...p_{2r}^3\geq E\\p_1...p_{2j}^3<E,j<r}}S(\mathcal{B}^{-}_{p_1...p_{2r}},p_{2r})\quad\quad \quad\quad \quad (6)\\&:=&\sum\mu(d)\frac{D}{2d}W(w)+O(D(\log D)^{-2})+\mathcal{R},\end{eqnarray}$
since the sum over $||\mathcal{B}_d^{-}|-\frac{D}{2d}|$ is already known to contribute $\ll EW_0w+D\exp(-\frac{c}{2}\sqrt{\log D})$. Putting $(5)$ and $(6)$ together, we obtain
$\begin{eqnarray}S(A,z)&\geq& \frac{2H}{D}\sum_{\substack{d\leq E\\d=p_1...p_{\ell}\\w\leq p_{\ell}<...<p_1<z\\p_1...p_{2r}^3<E,2r<\ell}}\mu(d) \frac{D}{2d}W(w)+O(\mathcal{E})+O(H(\log D)^{-2})+O(\frac{H}{D}\mathcal{R})\\&=&\frac{2H}{D}S(\mathcal{B}^{-},z)+O(\mathcal{E})+O(H(\log D)^{-2})+O(\frac{H}{D}\mathcal{R}).\end{eqnarray}$
Therefore, what remains to be shown is $\mathcal{R}:=\sum_{r}a_r\ll \frac{D}{\log D}(\log_3 D)^{-1},$ where $a_r$ is the sum over $p_1,...,p_{2r}$ in the definition of $\mathcal{R}$. Lemma 8 allows us to write
$\begin{eqnarray}a_r&\ll&\sum_{\substack{w\leq p_{2r}<...<p_1<z\\p_1...p_{2r}^3\geq E\\p_1...p_{2j}^3<E,j<r}}\frac{D}{p_1...p_{2r}\log \frac{D}{p_1...p_{2r}}}\frac{\log \frac{D}{p_1...p_{2r}}}{\log p_{2r}}g\left(\frac{\log \frac{D}{p_1...p_{2r}}}{\log p_{2r}}\right)\\&=&\sum_{\substack{w\leq p_{2r}<...<p_1<z\\p_1...p_{2r}^3\geq E\\p_1...p_{2j}^3<E,j<r}}\frac{D}{p_1...p_{2r}\log \frac{D}{p_1...p_{2r}}}\frac{(\log p_1)...(\log p_{2r})}{(\log p_1)...(\log p_{2r})^2}g\left(\frac{\log \frac{D}{p_1...p_{2r}}}{\log p_{2r}}\right)\quad (7)\end{eqnarray}$
Summing by parts similarly to what was done in the proof of Theorem 2, and making the change of variables $p_j=E^{\alpha_j},D=E^{\beta},d\alpha_j=\frac{1}{p_j\log E}dp_j$, we see that
$\begin{eqnarray}a_r&\ll& \frac{D}{\log E}\int_{\substack{\alpha_1+...+3\alpha_{2r}>1\\\alpha_1+...+3\alpha_{2j}<1,j<r\\0<\alpha_{2r}<...<\alpha_{1}<\frac{1}{s}}}g\left(\frac{\beta-\alpha_1-...-\alpha_{2r}}{\alpha_{2r}}\right)\frac{d\alpha_1...d\alpha_{2r}}{\alpha_1...\alpha_{2r}^2}:=I_{r},\end{eqnarray}$
where $s$ is the expression inside $g(\cdot)$. We need the following lemma for estimating the integral.

Lemma 9. Let $J_1(s)=\frac{3}{s}-1,1\leq s\leq 3$,
$\begin{eqnarray}J_n(s)=\frac{1}{s}\int_{A_n(s)}\frac{1}{\alpha_1...\alpha_{n-1}\alpha_{n}^2}d\alpha,\end{eqnarray}$
where $d\alpha=d\alpha_1...d\alpha_n$ and $A_n(s)$ denotes the conditions
$\begin{eqnarray}&&\alpha_1+...+3\alpha_{2r}>1\\&&\alpha_1+...+3\alpha_{2j}<1,j<r\\&&0<\alpha_{2r}<...<\alpha_{1}<\frac{1}{s}.\end{eqnarray}$
Furthermore, let $b_n=3$ for $n$ odd and $b_n=2$ otherwise. Then
$\begin{eqnarray}J_n(s)=\frac{1}{s}\int_{\max\{b_n,s\}}^{n+2}J_{n-1}(t)dt\end{eqnarray}$
and
$\begin{eqnarray}J_n(s) \leq Ac_0^{n}e^{-s}\end{eqnarray}$
for some constants $A$ and $c_0$ with $0<c_0<1.$

Proof. We have
$\begin{eqnarray}\int_{\max\{s,b_n\}}^{n+2}J_{n-1}(t-1)dt=\int_{\max\{s,b_n\}}^{n+2}\frac{1}{t-1}\int_{A_{n-1}(t-1)}\frac{1}{t_1...t_{n-1}^2}dt_1...dt_n.\end{eqnarray}$
With the change of variables $t_j=\frac{t}{t-1}u_j$, we have
$\begin{eqnarray}&&A_{n-1}(t-1)\\&=&\left\{0<u_{n-1}<...<u_{1}<\frac{1}{t},\,\,u_1+...+3u_{n-1}>1-\frac{1}{t},\,\,u_1+...+3u_{2j}< 1-\frac{1}{t}\right\}.\end{eqnarray}$
Denote further $t_1'=\frac{1}{t},t_j'=u_{j-1}$. in these new coordinates, $A_{n-1}(t-1)$ becomes $A_{n}(s),$ so
$\begin{eqnarray}\int_{\max\{s,b_n\}}^{n+2}J_{n-1}(t-1)dt=sJ_n(s)\end{eqnarray}$
for $n\geq 3.$ The choice of $J_1$ was such that this formula holds also for $n=2$. To prove the inequality for $J_n(s)$, we use induction. The case $n=0$ is clear. We prove only the case where $n$ is odd, the even case being similar. We have now
$\begin{eqnarray}J_{2n+1}(s)&=&\frac{1}{s}\int_{\max\{s,3\}}^{2n+3}\frac{1}{t-1}\int_{t-1}^{2n+2}J_{2n-1}(u-1)dudt\\&<&\frac{Ac_o^{2n}}{s}\int_{\max\{s,3\}}^{2n+3}\frac{1}{t-1}\int_{t-1}^{2n+2}e^{1-u}dudt\\&=&\frac{Ac_0^{2n}}{s}\int_{\max\{s,b_n\}}^{2n+3}\frac{e^{-t}}{t-1}dt.\end{eqnarray}$
Define $c_0>0$ as
$\begin{eqnarray}c_0^2=\frac{e^5}{3}\int_{3}^{\infty}\frac{e^{-t}}{t-1}dt<1.\end{eqnarray}$
If $s\leq 3,$ it holds that
$\begin{eqnarray}J_{2n+1}(s)\leq \frac{3Ac_0^{2n+2}}{e^{3}s}\leq Ac_0^{2n+2}e^{-s} e^{-s}.\end{eqnarray}$

If in turn $s>3$, we have
$\begin{eqnarray}\int_{s}^{\infty}\frac{e^{-t}}{t-1}dt<e^{3-s}\int_{3}^{\infty}\frac{e^{-t}}{t-1}dt=3c_0^2e^{-2-s},\end{eqnarray}$
so the claim holds also in this case. ■

We need one more simple lemma concerning the set $A_n(s).$

Lemma 10. Let $\alpha_1,...,\alpha_{2r}>0,\alpha_1\leq \frac{1}{2},\alpha_j\leq \alpha_{j-1}$, and $\alpha_1+...+3\alpha_{2j}<1$ for all $j<r$. Then
$\begin{eqnarray}\sum_{m=1}^{2j+1}\alpha_{m}\leq 1-\frac{1}{2\cdot 3^j}\end{eqnarray}$
for $j\leq r-1.$

Proof. In the case $j=1$, we have $\alpha_1+\alpha_2+\alpha_3\leq \alpha_1+2\frac{1-\alpha_2}{3}\leq 1-\frac{1}{6}$. Assume the case $j$ ($j<r-1$) holds. Then
$\begin{eqnarray}\sum_{m=1}^{2j+3}\alpha_m&=&\alpha_{2j+3}+\alpha_{2j+2}+\sum_{m=1}^{2j+1}\alpha_m\\&\leq& 2\frac{1-\alpha_1-...-\alpha_{2j+1}}{3}+\sum_{m=1}^{2j+1}\alpha_m\\&=&\frac{1}{3}\left(2+\sum_{m=1}^{2j+1}\alpha_m\right)\\&\leq& 1-\frac{1}{2\cdot 3^{r+1}},\end{eqnarray}$
so the case $j+1$ also holds. ■

We will use the two lemmas above to complete the proof of the Rosser-Iwaniec sieve. Notice that if $K$ is a large constant, we have
$\begin{eqnarray}&&\sum_{r>K\log_4 D}a_{r}\ll \sum_{r>K\log_4 D}I_{r}\ll \frac{D}{\log E}\sum_{r>K\log_4 D}c_0^{2r}\\&&\ll \frac{D}{\log E}c_0^{2K\log_4 D}\ll \frac{D}{\log E}(\log_3 D)^{-1}\end{eqnarray}$
by Lemma 9, so we may concentrate on the terms $a_r$ with $r\leq K\log_4 D$. The crucial observation that saves us the factor $(\log_3 D)^{-1}$ is that we we may assume in the last sum in $(6)$ the additional condition $p_1...p_{2r}^3\leq D;$ otherwise $S(\mathcal{B}^{-}_{p_1...p_{2r}},p_{2r})=0$ (this was the reason for considering $p_1...p_{2r}^3$ in the first place). This means that in the definition of $I_{r},$ we may add the condition $\alpha_1+...+3\alpha_{2r}<\beta$, where $\beta=\frac{\log D}{\log E}$. If we set $\nu=(\log_2 D)^{-1}$, we have $\beta=1+2\nu+O(\nu^2)$ from the definition of $E$. Thus, in the definition of $I_{r}$ we integrate first over $\alpha_{2r}$, which lies on a range of length $O(\nu)$, when $\alpha_1,...,\alpha_{2r-1}$ are fixed. Using the bound $\alpha_{j}\geq \alpha_{2r}\geq \frac{1}{2\cdot 3^r}$ coming from Lemma 10, we see that
$\begin{eqnarray}I_{r}&\ll& \frac{D}{\log E}\int_{\substack{\alpha_1+...+3\alpha_{2j}<1,j<r\\\frac{1}{2\cdot 3^r}<\alpha_{2r-1}<...<\alpha_1<\frac{1}{s}}}\frac{1}{\alpha_1...\alpha_{2r-1}}(2\cdot 3^r)^2\cdot \nu\,\, d\alpha_1...d\alpha_{2r-1}\\&\ll& \frac{\nu D}{\log E}(2^{2r+1}\cdot 3^{r(2r+1)}).\end{eqnarray}$
When this is summed over $r\leq K\log_4 D$, it is seen that
$\begin{eqnarray}\sum_{r\leq K\log_4 D}a_r\ll \sum_{r\leq K\log_4 D}I_{r}\ll \frac{\nu D}{\log E}3^{3(K\log_4 D)^2}\ll \frac{D}{\log D}(\log_3 D)^{-1},\end{eqnarray}$
which was to be shown.

The proof of the upper bound sieve is otherwise analogous, but the conditions we create with Buchstab's identity are of the form $p_1...p_{2j-1}^2>E.$ This leads to the term corresponding to the sum $\mathcal{R}$ in $(6)$ being negative, so it may be neglected. Lemmas 9 and 10 take a very similar, though not identical, form, and the error terms are bounded as before, but in the end the extra condition for $\alpha_i$ is that $\alpha_1+...+2\alpha_{2r-1}\in (1, \beta)$, and this allows us to save a factor of $\nu$. ■

An applicaton to twin almost primes

The Rosser-Iwaniec sieve gives us right away some quite strong consequences. For example, the lower bound case gives us a result on twin almost primes. We do not quite get Chen's theorem that $p+2$ is infinitely often a semiprime, when $p$ ranges through the prime, but we get a version where semiprimes are replaced by products of at most three primes.

Theorem 11. There are infinitely many primes $p$ such that $p+2$ has at most $3$ prime factors.

Proof. We take $A=\{x\leq p+2\leq 2x:\,p\in \mathbb{P}\setminus\{2\}\},\,D=x,\,z=\frac{x^{\frac{1}{4}}}{\log^{10}x}$. We have
$\begin{eqnarray}|A_d|&=&\pi(2x;d,-2)-\pi(x;d,-2)+O(1)\\&=&\frac{\text{Li}(2x)-\text{Li}(x)}{\varphi(d)}+\left(\pi(2x;d,-2)-\frac{\text{Li}(2x)}{\varphi(d)}\right)-\left(\pi(x;d,-2)-\frac{\text{Li}(x)}{\varphi(d)}\right)+O(1).\end{eqnarray}$
The Bombieri-Vinogradov theorem from this post states that
$\begin{eqnarray}\sum_{d\leq \frac{x^{\frac{1}{2}}}{\log^C x}}\max_{y\leq x}\max_{a\in \mathbb{Z}_{q}^{\times}}\left|\pi(y;d,a)-\frac{\text{Li}(y)}{\varphi(d)}\right|\ll \frac{x}{\log^{C-6} x}\end{eqnarray}$
for any $C>0$. Taking $C=10$, we see that
$\begin{eqnarray}S\left(A,\mathbb{P},\frac{x^{\frac{1}{4}}}{\log^{10} x}\right)&\gg& \frac{x}{\log x}\prod_{p\leq x^{\frac{1}{4}}}\left(1-\frac{1}{p}\right)g\left(\frac{\log D}{\log z}\right)+O\left(\frac{x}{\log^4 x}\right)\\&\gg& \frac{x}{\log^2 x}\quad (8).\end{eqnarray}$
This already shows that $A$ contains $\gg \frac{x}{\log^2 x}$ numbers that are products of at most four primes, because if $p_1,...,p_5>\frac{x^{\frac{1}{4}}}{\log^{10}x}$, then $p_1...p_5>2x$ for large $x$. If we show that
$\begin{eqnarray}\left|\left\{(p_1,p_2.p_3,p_4):\,p_1p_2p_3p_4\in [x,2x],\, p_i>\frac{x^{\frac{1}{4}}}{\log^{10}x}\right\}\right|=o\left(\frac{x}{\log^2 x}\right),\end{eqnarray}$
we are done, because then $A$ contains many products of at most three primes. To estimate the quantity $(8)$, we split $p_1,...,p_4$ into dyadic intervals and obtain
$\begin{eqnarray}(8)&\ll& \sum_{k_1+k_2+k_3+k_4=\log_2 x\atop k_1,k_2,k_3,k_4>\log_2 F(x)}|\{(p_1,p_2,p_3,p_4):p_1\sim 2^{k_1},p_2\sim 2^{k_2},p_3\sim 2^{k_3},p_4\sim 2^{k_4}\}\\&\ll& \sum_{k_1+k_2+k_3+k_4=\log_2 x\atop k_1,k_2,k_3,k_4>\log_2 F(x)}\frac{2^{k_1+k_2+k_3+k_4}}{k_1k_2k_3k_4}\\&\ll& \frac{x}{\log^4 x}|\{(k_1,k_2,k_3,k_4):\,k_1,k_2,k_3,k_4>\log_2 F(x),\,k_1+k_2+k_3+k_4=\log_2 x\}|\\&=&\frac{x}{\log^4 x}|\{(k_1',k_2',k_3',k_4'):\,k_1',k_2',k_3',k_4'>0,\,k_1'+k_2'+k_3'+k_4'=\log_2 \frac{x}{F(x)^4}\}|\\&\ll& \frac{x}{\log^4 x} (\log \log x)^4=o\left(\frac{x}{\log^2 x}\right),\end{eqnarray}$
where $F(x)=\log_2\frac{x^{\frac{1}{4}}}{\log^{10} x}$ and $X\sim Y$ denotes $Y\leq X\leq 2Y$. We have in fact shown that the interval $[x,2x]$ contains $\gg \frac{x}{\log^2 x}$ primes $p$ such that $p+2$ has at most three prime divisors. This is the correct order of magnitude, as is seen from the upper bound of the Rosser-Iwaniec sieve. The proof is finished. ■