**Choosing an auxiliary function**
We follow mostly Halberstam and Richert's book, using the notations of the previous post about Brun sieve. In that post, we found out that if the function $\sigma(n):=\sum_{d\mid n}\mu(d)\chi(d)$ approximating the characteristic function of $n=1$ satisfies $\sigma(n)\geq 0$ or $\sigma(n)\leq 0$ for all $n$, then

\[\begin{eqnarray}S(A,P,z)&\leq&xW(z)\left(1+\sum_{1<d\mid P(z)}\sigma(d)g(d)\right)+\sum_{d\mid P(z)}|\chi(d)||R_d|\quad (1)\end{eqnarray}\]

or

\[\begin{eqnarray}S(A,P,z)\geq xW(z)\left(1+\sum_{1<d\mid P(z)}\sigma(d)g(d)\right)-\sum_{d\mid P(z)}|\chi(d)||R_d|,\quad (2)\end{eqnarray}\]

respectively. We noticed that the very simple auxiliary function $\chi(d)=1_{\{\nu(d)\leq r\}}$ gives rise to such a function $\sigma$, and from this we derived a combinatorial sieve with error term $O(z^{\log \log z})$. It turns out, however, that the choice was not optimal, and a more clever choice of $\chi(d)$ gives the desired error term of $O(z^K)$ for some $K$.

Let us consider what would be a better auxiliary function $\chi$. In the basic Brun sieve, the first error term

\[\begin{eqnarray}\sum_{1<d\mid P(z)}\sigma(d)g(d)\end{eqnarray}\]

was $o(1)$ as long as $r>C\log \log z$ for large $C$, owing to the fact that we had $\sigma(n)=0$ if $0<\nu(n)\leq r$. If this additional condition is fulfilled, the problematic error term is the second one,

\[\begin{eqnarray}\sum_{d\mid P(z)}|\chi(d)||R_d|.\end{eqnarray}\]

Fortunately, this error can be estimated much more sharply if we choose a different function $\chi$ for which $\sigma(n)$ still has constant sign. We assume $|R_d|\leq \omega(d)$ (this is usually satisfied, but there are also important exceptions), and it is also reasonable to assume $\chi(d)\in \{0,1\}$. Now if $\chi(d)=0$ for all $d\mid P(z)$ except for those having at most $m$ prime factors above $z'$, where $m\in \mathbb{Z}_+$ and $z'<z$ are parameters, we find

\[\begin{eqnarray}\sum_{d\mid P(z)}|\chi(d)|\omega(d)\leq \left(\sum_{p\leq z'}\omega(p)\right)^{M-m}\left(\sum_{p\leq z}\omega(p)\right)^m,\end{eqnarray}\]

where $M$ is the maximal number of prime factors of $d$ such that $\chi(d)\neq 0$ can hold. We suppose as before that $\omega(p)\leq A$ for some constant $A$, so the error is at most $(Az')^{M-m}z^m$. If for example $M=r$ (but $\chi$ is not supposed to be $1_{\nu(d)\leq r}$) and $m=1$, the error is at most $(Az')^{r-1}z$. If $z'$ is a lot smaller than $z$, say $z'=\sqrt{z},$ then the error is at most $(Az)^{\frac{r-1}{2}}z$. Our original error in the basic Brun sieve was $(Az)^r$, so we have basically cut the error to its square root (given that a function $\chi$ satisfying the required properties exist). The error is nevertheless too large, so we want to iterate the procedure. If we demand $d$ to have at most $k$ prime factors above $z^{\frac{1}{2^k}}$ for all $k$, and at most $r$ in total, the error is bounded by

\[\begin{eqnarray}\left(\sum_{p\leq z\atop}\omega(p)\right)\left(\sum_{p\leq z^{\frac{1}{2}}}\omega(p)\right)\left(\sum_{p\leq z^{\frac{1}{4}}}\omega(p)\right)...\leq (Az)^{1+\frac{1}{2}+\frac{1}{4}+...}=(Az)^2,\end{eqnarray}\]

so we indeed get an error term that is only a power of $z$. Our task now is to do the above argument rigorously, and with suitable parameters (we do not necessarily want to take square roots).

We use the notation $\nu_{a}(d)$ for the number of prime factors of $d$ that are at least $a$. We define two functions $\chi_1,\chi_2$ in terms of the sequence $z_r,z_{r-1},...z_1,z_0$, where $z_k=z^{e^{-\lambda (r-k)}}$ for a parameter $\lambda \in (0,1)$ and $z_r=z,z_0=2,z_{i+1}>z_i$. For $d\mid P(z)$ set

\[\begin{eqnarray}\chi_1(d)&=&1\quad \text{if}\quad \nu_{z_k}(d)\leq 2(r- k) \quad \text{for all}\quad 0\leq k\leq r-1\\\chi_2(d)&=&1\quad \text{if}\quad \nu_{z_k}(d)\leq 2(r-k)-1 \quad \text{for all}\quad 0\leq k\leq r-1\end{eqnarray}\]

and $\chi_1(d)=0,\chi_2(d)=0$ in all the other cases. This is almost what we did heuristically above. Now we prove a crucial lemma.

**Lemma 1.**If $\sigma_j(n)=\sum_{d\mid n}\mu(d)\chi_j(d)$ for $j=1,2$, then $\sigma_j(1)=1$ and $\sigma_1(n)\geq 0,\sigma_2(n)\leq 0$ for all square-free $n>1$.

**Proof.**Clearly $\sigma_j(1)=1$. We give a simple sufficient criterion for $\sigma(n)$ to have constant sign, in terms of $\chi$. If $p$ is a prime divisor of $n$, then

\[\begin{eqnarray}\sigma(n)&=&\sum_{d\mid \frac{n}{p}}\mu(d)\chi(d)+\sum_{d\mid \frac{n}{p}}\mu(pd)\chi_1(pd)\\&=&\sum_{d\mid \frac{n}{p}}\mu(d)(\chi_1(d)-\chi_1(p d)).\end{eqnarray}\]

In particular, we can choose $p=q(n)$, the smallest prime divisor of $n$. As $n$ is square-free, we have $q(d)>q(n)$, so it is enough to have

\[\begin{eqnarray}\mu(d)(\chi(d)-\chi(pd))\geq 0\quad \text{for all primes}\quad p<q(d).\end{eqnarray}\]

We check this criterion for the functions $\chi_1$ and $\chi_2$. Trivially $\chi_j(d)\geq \chi_j(pd)$, so it is enough to show that if $\chi_j(d)=1, \chi_j(pd)=0$, then $\mu(d)=(-1)^{j+1}$. Let $p\in [z_k,z_{k+1})$. As $p<q(d)$, we have $\nu(d)=\nu_{z_k}(d),$ and this must be equal to $2(r-k)$ if $\chi_1(d)=1$ and $\chi_1(pd)=0.$ Therefore $\nu(d)$ is even and $\mu(d)=1$. Similarly, we see in the case of $\chi_2$ that $\mu(d)=-1$. Now the criterion is verified, so the lemma is proved. ■

**Estimating the error terms**
Now we are allowed to use the formula $(1)$ or $(2)$, so we turn to error terms.

**Lemma 2.**If $z_k$ are as before ($\chi_1$ and $\chi_2$ are determined by them), we have

\[\begin{eqnarray}\sum_{d\mid P(z)}\chi_1(d)|R_d|\leq (Az)^{\frac{2}{1-e^{\lambda}}}.\end{eqnarray}\]

\[\begin{eqnarray}\sum_{d\mid P(z)}\chi_2(d)|R_d|\leq (Az)^{\frac{2}{1-e^{\lambda}}-1}.\end{eqnarray}\]

**Proof.**This follows just as in our heuristic. Indeed,

\[\begin{eqnarray}\sum_{d\mid P(z)}\chi_1(d)|R_d|&\leq& \left(1+\sum_{p\leq z}\omega(p)\right)^2\left(1+\sum_{p\leq z^{e^{-\lambda}}}\omega(p)\right)^2\left(1+\sum_{p\leq z^{e^{-2\lambda}}}\omega(p)\right)^2...\\&\leq& (Az)^{2+2e^{-\lambda}+2e^{-2\lambda}+...}=(Az)^{\frac{2}{1-e^{-\lambda}}}.\end{eqnarray}\]

and the other estimate follows similarly. ■

**Lemma 3.**Assume $\omega(p)\leq A$, and let $\lambda>\frac{1}{Ae}$ be a number such that

\[\begin{eqnarray}\kappa_A:=(1+10^{-10})\frac{A}{2}\lambda e^{1+\frac{A}{2}\lambda}\in (0,1).\end{eqnarray}\]

Then, for large values of $z$, it holds that

\[\begin{eqnarray}\sum_{1<d\mid P(z)}|\sigma_j(d)|g(d)\ll_A \kappa_A^{2\log \log z}\quad (3).\end{eqnarray}\]

**Proof.**Let $j=1;$ the other case is similar. We estimate the terms in the sum on the left of $(3)$ depending on $q(d)$. We do this because for those $d$ for which $\nu(d)$ is large (say $\geq r$), we can estimate essentially as we did in the previous post, but for small values of $\nu(d)$ we have to be careful so that our upper bound is not automatically $\gg 1$ (if $\nu(d)$ is small, we will see that $q(d)$ is large so that $g(d)$ is small). We find

where we obtained $(4)$ from the proof of Lemma 1, $(5)$ as in the previous post, and $(6)$ by writing $d=q(d)d'$ with $q(d')>q(d)$ and applying the multiplicativity of $g$. We estimate $(6)$ further by

\[\begin{eqnarray}&\leq& \sum_{t=0}^{r-1} \frac{(A+1)^2}{z_t}\sum_{m=2(r-t)}^{\infty} \binom{m}{2(r-t)}\frac{1}{m!}\left(\sum_{z_t\leq z\leq z}g(p)\right)^m\\&=&\sum_{t=0}^{r-1}\frac{(A+1)^2}{z_t}\left(\sum_{z_t\leq p\leq z} g(p)\right)^{2(r-t)}\exp\left(\sum_{z_t\leq p\leq z}g(p)\right)\cdot \frac{1}{(2(r-t)!)}\\&\ll_A& \sum_{t=0}^{r-1}\frac{1}{z_t}\left(\sum_{z_t\leq p\leq z}\frac{A}{p}+c_A\right)^{2(r-t)}\exp\left(\sum_{z_t\leq p\leq z}\frac{A}{p}\right)\cdot \frac{1}{(2(r-t))!}\quad (7)\\&\ll_A& \sum_{t=0}^{r-1}\frac{1}{z_t} \left(\frac{Ae(r-t)\lambda+c_A}{2(r-t)}\right)^{2(r-t)}e^{A(r-t)\lambda}\quad\quad \quad \quad \quad \quad \quad \quad \quad (8)\\&\leq& \max_{0\leq t\leq r-1} \frac{1}{z_t}\left(\frac{Ae\lambda}{2}+\frac{c_A}{2(r-t)}\right)^{2(r-t)}e^{A(r-t)\lambda}\quad\quad \quad \quad \quad \quad \quad \quad \,\,\, \,(9).\end{eqnarray}\]

The first intermediate steps are just as for the basic Brun sieve. For $(7)$ we used

\[\begin{eqnarray}\sum_{z_t\leq p\leq z}g(p)=\sum_{z_t\leq p\leq z}\left(\frac{\omega(p)}{p}+\frac{\omega(p)^2}{p(p-\omega(p))}\right)\leq \sum_{z_t\leq p\leq z}\frac{\omega(p)}{p}+c_A.\end{eqnarray}\]

For $(8)$ we used Mertens' estimates (see this post) and Stirling's approximation.

When $r-t\leq 10^{10}\cdot\frac{c_A}{Ae}$, the expression $(9)$ is bounded by

\[\begin{eqnarray}\ll_A z^{-exp(-\alpha_A \lambda)}\left(\frac{Ae\lambda}{2}+\frac{1}{2Ae}\right)^{2\beta_A}=o(z^{-\varepsilon_A}),\end{eqnarray}\]

where $\alpha_A,\beta_A,\varepsilon_A>0$ are suitable constants. For other values of $r-t$, $(9)$ is at most

\[\begin{eqnarray}\frac{1}{z_t}\kappa_A^{2(r-t)},\end{eqnarray}\]

where $\kappa_A$ is asin the theorem. Hence we want to maximize over $[0,r]$ the function $f(y)=-e^{-\lambda y}\log z+2y\log \kappa_A$, where $\kappa_A\in (0,1)$ by assumption. We have $f(0)=-\log z,f(r)=-1+2r\log \kappa_A$. Also $f'(y)=0$ if and only if $y=\frac{1}{\lambda}\log\frac{2\kappa_A}{\lambda \log z}=\frac{1}{\lambda}\log \log z+O_A(1)$. For this value of $y=r-t$, the expression $(9)$ becomes

\[\begin{eqnarray}\ll_A \kappa_A^{\frac{2\log \log z}{\lambda}}\quad \quad (10),\end{eqnarray}\]

and since $2=z^{e^{-\lambda r}}$, we have $r=\frac{\log \log z}{\lambda}+O(1)$, so $(10)$ is the maximum of $(9)$ up to a constant factor. This completes the proof. ■

Combining the estimates above, we have the refined Brun sieve.

**Theorem 4 (refined Brun sieve).**Let $\omega(p)\leq A$, and let $\lambda\in (0,1)$ be such that $\kappa_A\in (0,1)$, where $\kappa_A$ is defined in Lemma 3. Then, for some $C_A>0,$

\[\begin{eqnarray}S(A,P,z)&\leq& xW(z)(1+C_A\cdot \kappa_A^{\frac{2\log \log z}{\lambda}})+(Az)^{\frac{2}{1-e^{-\lambda}}},\\S(A,P,z)&\geq& xW(z)(1-C_A\cdot \kappa_A^{\frac{2\log \log z}{\lambda}})+(Az)^{\frac{2}{1-e^{-\lambda}}-1}.\end{eqnarray}\]

**Proof.**This follows by combining Lemmas 1, 2 and 3. ■

**Applications**
As the error term in the sieve is a power of $z$, we may consider various problems about almost primes. We start by considering twin almost primes.

We choose $A=\{n(n+2): n\leq x\}, P=\mathbb{P}\setminus\{2\}, \omega(p)=2$. Then $A=2$ and we may choose $\lambda=0.2784$ (we want $\lambda$ to be as large as possible). Then $\frac{2}{1-e^{-\lambda}}=8.2303...$, so

\[\begin{eqnarray}(1-o(1))xW(z)+O(z^{7.3})\leq S(A,P,z)\leq(1+o(1))xW(z)+O(z^{8.3})\end{eqnarray}\]

Choosing $z=2x^{\frac{1}{8}}$, we have $S(A,P,2x^{\frac{1}{8}})\gg\frac{x}{\log^2 x}$, and similarly $S(A,P,2x^{\frac{1}{9}})\ll \frac{x}{\log^2 x}$. The values of $n$ counted by $S(A,P,2x^{\frac{1}{8}})$ are such that $n$ and $n+2$ have at most $7$ prime factors; otherwise $n$ or $n+2$ would have a prime factor below $x^{\frac{1}{8}}$. Furthermore, $S(A,P,2x^{\frac{1}{9}})$ counts the twin primes on $(x^{\frac{1}{9}},x]$ among some other numbers, so $\pi_2(x)\ll \frac{x}{\log^2 x}$, where $\pi_2$ is the counting function of twin primes. In conclusion, we have the following theorems.

**Theorem 5.**There are infinitely many integers $n$ such that $n$ and $n+2$ have at most $7$ prime factors, counted with multiplicities.

**Theorem 6.**We have $\pi_2(x)\ll \frac{x}{\log^2 x}$.

The latter estimate is believed to be optimal up to a constant factor (and it would be if we could have an error term of $o(z^2)$ in the computations above), but of course we do not even know whether $\pi_2(x)$ is unbounded.

Next we consider Goldbach's problem for almost primes. Let $x$ be any large even integer, and set $A=\{n(x-n):n\leq x\}, P=\mathbb{P}, \omega(p)=2$ if $p\nmid x$, $\omega(p)=1$ if $p\mid x$. Then again $A=2$ and we may choose $\lambda=0.2784$. We get, as before,

\[\begin{eqnarray}S(A,P,2x^{\frac{1}{8}})\gg\frac{x}{\log^2 x}, \quad S(A,P,2x^{\frac{1}{9}})\ll \prod_{p\mid x}\left(1-\frac{1}{p}\right)\prod_{p\nmid x\atop p\leq x}\left(1-\frac{2}{p}\right).\end{eqnarray}\]

The quanity $S(A,P,2x^{\frac{1}{8}})$ counts integers $n\in [1,x]$ such that $n$ and $x-n$ have at most $7$ prime factors. On the other hand, $S(A,P,2x^{\frac{1}{9}})$ is an upper bound for $N(x)$, the number of representations of $x$ as a sum of two primes. To summarize,

**Theorem 7.**Every large enough even integer is the sum of two numbers having at most $7$ prime factors.

**Theorem 8.**We have $N(x)\ll \frac{x}{\log^2 x}\prod_{p\mid x}\frac{p-1}{p-2}$, where $N(\cdot)$ is as before.

Next we consider primes represented by polynomials. Let $f(x)\in \mathbb{Z}[x]$ be a non-constant irreducible polynomial with positive leading coefficient, of degree $d$. In what follows, implicit constants may depend on the coefficients of $f$ but not on $d$. Let $s(f,p)$ be the number of solutions to $f(m)\equiv 0 \pmod p$, and suppose that $s(f,p)<p$ for all $p$; that is, $f$ has no fixed prime divisor. By Lagrange's theorem, $s(f,p)\leq d$. Clearly we also have $|A_d|=\frac{s(f,p)}{p}|A_p|+R_p, |R_p|\leq \omega(p)$, and by the Chinese remainder theorem the same holds for any $d\mid P(z)$. We can choose $A=d$, so we need to choose $\lambda$ such that

\[\begin{eqnarray}(1+10^{-10})\frac{d}{2}\lambda e^{1+\frac{d}{2}\lambda}\in(0,1).\end{eqnarray}\]

If $u=\frac{d}{2}\lambda$, we may take $u$ to be the solution of $ue^{u}=\frac{1-10^{-10}}{e}$, which is $0.2784...$. Then $\lambda=\frac{0.5569...}{d}$ and $\frac{2}{1-e^{-\lambda}}\leq 4.31d$ (by applying $e^{-t}\leq 1-t+\frac{t^2}{2}$ for $t\geq 0$). Now the Brun sieve gives

\[\begin{eqnarray}S(A,P,x^{\frac{1}{4.31d-0.1}-1})\gg \frac{x}{\log^d x}, \quad S(A,P,x^\frac{1}{0.431d-0.1})\ll x\prod_{p\leq x\atop s(f,p)>0}\left(1-\frac{1}{p}\right)\end{eqnarray}\]

(the upper bound is rather rough). By the Chebotarev density theorem, $s(f,p)>0$ happens for the proportion $\frac{G_P}{\#G}$ of primes (in the sense of the asymptotic density), where $G$ is the Galois group of the field extension of $\mathbb{Q}$ arising from $f$, and $G_P$ is the number of those permutations $\sigma \in G$ ($G$ is a subgroup of the permutation group $S_d$) with just one cycle. Let $S(d)$ be the number of permutations of $d$ elements with just one cycle. Then $s(f,p)>0$ for at least density $\frac{S(d)}{d!}\leq \frac{1}{d}$ of primes. By partial summation, we see that

\[\begin{eqnarray}\sum_{p\leq x\atop s(f,p)>0}\frac{1}{p}\geq (1-o(1))\cdot \frac{1}{d}\log \log x.\end{eqnarray}\]

Therefore

\[\begin{eqnarray}S(A,P,x^{\frac{1}{4.31d-0.1}})\ll \frac{x}{\log^{\frac{1}{d}} x}.\end{eqnarray}\]

We obtained the following theorems.

**Theorem 9.**If $f$ is a polynomial as above, there are infinitely many positive integers $n$ such that $f(n)$ has at most $\frac{1}{\frac{1}{4.31d-0.1}-1}$ prime divisors, counted with multiplicities. In particular, if $f$ is a quadratic polynomial satisfying the conditions, and in the special case $d=2$, there are infinitely many positive integers $n$ with $f(n)$ having at most $7$ prime factors with multiplicities (since then we have $\omega(p)\leq 2$, so the error term is the same as in the twin prime problem).

**Theorem 10.**If $f$ is a polynomial as above, there are $\ll \frac{x}{\log^{\frac{1}{d}}x}$ primes of the form $f(n)$ with $n\in[1,x]$, where the implicit constant may depend on the coefficients of $f$ but not on $d$.