We now prove the theorems mentioned above. This post deals only with positive results on convergence; negative results will be considered in a later post. One of course has to mention the marvelous results by Carleson from 1966 that the Fourier series of an $L^2$ function converges pointwise almost everywhere, and by Hunt from 1968 that the same holds for $L^p$ functions, $1<p<\infty$. The proofs are notoriously difficult, and we will not address them.
Dirichlet and Fejér kernels
When studying the convergence properties of Fourier series, we of course want to understand the partial sums
$$S_N(f,x):=\sum_{n=-N}^{N}\hat{f}(n)e^{2\pi inx}, \quad f\in L^1([0,1]).$$A key idea in making progress in convergence results is to notice that the linear operators $S_N(f,\cdot)$ arise from convolution by kernels called the Dirichlet kernels.
Theorem 1. Let $D_N(x):=\sum_{n=-N}^N e^{2\pi inx}$ be the Dirichlet kernel (note that in some sources, $2\pi$ is omitted). Then for $1$-periodic functions $f\in L^1([0,1])$,
\[\begin{eqnarray}S_N(f,x)=f*D_N(x):=\int_{0}^{1}f(y)D_N(x-y)dy,\end{eqnarray}\]
and $D_N$ satisfies $D_N(x)=\frac{\sin((2N+1)\pi x)}{\sin(\pi x)}$.
Proof. We compute using the definition of Fourier coefficients:
\[\begin{eqnarray}S_N(f,x)=\int_ {0}^1\sum_{n=-N}^{N}f(y)e^{2\pi i n (x-y)}dy=\int_ {0}^1f(y)D_N(x-y)dy.\end{eqnarray}\]The alternative representation of $D_N$ follows by evaluating a geometric series:
\[\begin{eqnarray}\sum_{n=-N}^{N}e^{2\pi inx}&=&1+2\Re\left(e^{2\pi i x}\sum_{n=0}^{N-1}e^{2\pi inx}\right)\\&=&1+2\Re\left(\frac{e^{2\pi i (N+1)x}-e^{2\pi i x}}{e^{2\pi i x}-1}\right)\\&=&1+2\Re\left(\frac{e^{(2N+1)\pi i x}-e^{\pi i x}}{e^{\pi i x}-e^{-\pi i x}}\right)\\&=&\frac{\sin((2N+1)\pi x)}{\sin(\pi x)}.\end{eqnarray}\]This completes the proof. ■
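As a quick sanity check of the closed form (a worked instance added for illustration, not part of the proof), take $N=1$. Using the triple-angle identity $\sin(3\theta)=3\sin\theta-4\sin^3\theta$ with $\theta=\pi x$,
\[\begin{eqnarray}D_1(x)&=&e^{-2\pi ix}+1+e^{2\pi ix}=1+2\cos(2\pi x),\\\frac{\sin(3\pi x)}{\sin(\pi x)}&=&3-4\sin^2(\pi x)=3-2(1-\cos(2\pi x))=1+2\cos(2\pi x),\end{eqnarray}\]so the two expressions agree, as they should.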
Now the study of Fourier series has more or less been reduced to the study of the $1$-periodic kernels $D_N$. A natural question to ask is what makes a sequence of kernels (functions) $(K_n)_{n=1}^{\infty}$ nice in terms of convergence properties. The following properties turn out to be the most crucial.
Definition 1. A sequence $(K_n)_{n=1}^{\infty}$ of $1$-periodic functions is called a good family of kernels (other names include approximation to identity) if the following hold:
(i) $\int_{-\frac{1}{2}}^{\frac{1}{2}}K_n(x)dx=1$ for all $n$ (average equal to $1$)
(ii) There is a constant $M$ such that $\int_{-\frac{1}{2}}^{\frac{1}{2}}|K_n(x)|dx\leq M$ for all $n$ (uniform boundedness of $L^1$-norms)
(iii) For every $\varepsilon>0$, it holds that $\int_{[-\frac{1}{2},\frac{1}{2}]\setminus [-\varepsilon,\varepsilon]} |K_n(x)|dx\to 0$ as $n\to \infty$ (mass concentrates at the origin).
With these definitions, we have
Theorem 2. For a good family of kernels and for any $f\in L^p,1\leq p<\infty$, we have $\|f-f*K_n\|_p\to 0$ as $n\to \infty$. Moreover, if $f\in L^{\infty}$, we have $f*K_n(x)\to f(x)$ as $n\to \infty$ at the continuity points of $f$. If $f$ is everywhere continuous, this convergence is uniform.
Proof (outline). Let us start with the case $f\in L^{\infty}$. Using property (i) of the kernels,
\[\begin{eqnarray}|f(x)-f*K_n(x)|&=&\left|\int_{-\frac{1}{2}}^{\frac{1}{2}}(f(x)-f(x-y))K_n(y)dy\right|\\&\leq&\int_{-\delta}^{\delta}|f(x)-f(x-y)||K_n(y)|dy+\int_{\delta<|y|<\frac{1}{2}}|f(x)-f(x-y)||K_n(y)|dy\end{eqnarray}\]for any $\delta\in (0,\frac{1}{2}).$ If $f$ is continuous at $x$ and $\varepsilon>0$, we have $|f(x)-f(x-y)|<\varepsilon$ when $|y|<\delta$ for small enough $\delta$. Therefore, the first integral is at most $M\varepsilon$ by property (ii). The second integral is bounded by
\[\begin{eqnarray}2\|f\|_{\infty}\int_{\delta<|y|<\frac{1}{2}}|K_n(y)|dy,\end{eqnarray}\]and this is at most $\varepsilon$ for large enough $n$ by (iii). One sees by uniform continuity that if $f$ is everywhere continuous, the convergence $f*K_n\to f$ is even uniform. This completes the proof of the $L^{\infty}$ case.
The case $f\in L^p$, $1\leq p<\infty$, is significantly more complicated, and is not so important for us, as unfortunately $(D_N)$ is not a good family of kernels. We skip the proof of the following lemma from real analysis:
\[\begin{eqnarray}A_f(x):=\int_{-\frac{1}{2}}^{\frac{1}{2}}|f(y-x)-f(y)|^p dy\to 0\end{eqnarray}\]as $x\to 0$ for $f\in L^p$ (the proof is based on reduction to the case of continuous functions using Lusin's theorem). We use the duality of $L^p$-spaces to write
\[\begin{eqnarray}\|f-f*K_n\|_p&=&\sup_{\|\psi\|_q\leq 1}\int_{-\frac{1}{2}}^{\frac{1}{2}}\psi(x)\left|\int_{-\frac{1}{2}}^{\frac{1}{2}}(f(x)-f(x-y))K_n(y)dy\right|dx\\&\leq& \sup_{\|\psi\|_q\leq 1}\int_{-\frac{1}{2}}^{\frac{1}{2}}|\psi(x)|\int_{-\frac{1}{2}}^{\frac{1}{2}}|f(x)-f(x-y)||K_n(y)|dy\,dx\\&=& \sup_{\|\psi\|_q\leq 1}\int_{-\frac{1}{2}}^{\frac{1}{2}}\int_{-\frac{1}{2}}^{\frac{1}{2}}|\psi(x)||f(x)-f(x-y)|dx|K_n(y)|dy,\end{eqnarray}\]where $q$ is the conjugate exponent of $p$, and we changed the order of integration with the Fubini-Tonelli theorem. By Hölder's inequality, the inner integral is bounded by $\|\psi\|_qA_f(y)^{\frac{1}{p}}\leq A_f(y)^{\frac{1}{p}}.$ Therefore the whole expression is at most
\[\begin{eqnarray}\int_{-\frac{1}{2}}^{\frac{1}{2}}A_f(y)^{\frac{1}{p}}|K_n(y)|dy.\end{eqnarray}\]Since $A_f(y)^{\frac{1}{p}}\to 0$ as $y\to 0$ and $A_f(y)^{\frac{1}{p}}\leq 2\|f\|_p$ for all $y$, the same argument as in the $L^{\infty}$ case above shows that this integral approaches zero as $n\to \infty$. ■
As already mentioned, the Dirichlet kernels fail to be a good family of kernels. The theorem above shows that otherwise Fourier analysis would be easy. We will later discuss the divergence of Fourier series, and understand what exactly prevents $(D_N)$ from being a good family of kernels.
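To give a rough idea already here, the following sketch (a standard estimate, added for illustration and not needed for the positive results of this post) indicates that condition (ii) is the one that fails for $(D_N)$. Split $[0,\frac{N}{2N+1}]$ into the intervals $I_k=[\frac{k-1}{2N+1},\frac{k}{2N+1}]$, $k=1,\ldots,N$. On $I_k$ we have $\frac{1}{x}\geq \frac{2N+1}{k}$ and $\int_{I_k}|\sin((2N+1)\pi x)|dx=\frac{2}{(2N+1)\pi}$, so using $|\sin(\pi x)|\leq \pi|x|$ and the evenness of $D_N$,
\[\begin{eqnarray}\int_{-\frac{1}{2}}^{\frac{1}{2}}|D_N(x)|dx&\geq& 2\sum_{k=1}^{N}\int_{I_k}\frac{|\sin((2N+1)\pi x)|}{\pi x}dx\\&\geq& 2\sum_{k=1}^{N}\frac{2N+1}{\pi k}\cdot\frac{2}{(2N+1)\pi}\\&=&\frac{4}{\pi^2}\sum_{k=1}^{N}\frac{1}{k}.\end{eqnarray}\]The harmonic sum grows like $\log N$, so no uniform bound $M$ as in (ii) can exist.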
Even though $(D_N)$ is not a good family of kernels, it is possible to ''smoothen'' it to make it one. This was Fejér's observation and leads to his theorem.
Theorem 3 (Fejér). Let $f$ be $1$-periodic and integrable on $[0,1].$ Then the Fourier series of $f$ converges on average:
\[\begin{eqnarray}\frac{1}{N+1}\sum_{n=0}^{N}S_n(f,x)\to f(x)\end{eqnarray}\] whenever $f$ is continuous at $x$. If $f$ is everywhere continuous, the convergence is uniform.
Proof. Define the Fejér kernels
\[\begin{eqnarray}F_N(x)&=&\frac{1}{N+1}\sum_{n=0}^{N}D_n(x)\\&=&\frac{1}{N+1}\sum_{n=0}^{N}\sum_{k=-n}^{n}e^{2\pi ikx}\\&=& \frac{1}{N+1}\sum_{k=-N}^{N}(N+1-|k|)e^{2\pi ikx}\\&=&\frac{1}{N+1}\left(\sum_{k=0}^N e^{2\pi i (k-\frac{N}{2})x}\right)^2.\end{eqnarray}\]This is a less oscillating version of the Dirichlet kernels; in particular, it is nonnegative. We will show that $(F_N)$ is a good family of kernels. Since \[\begin{eqnarray}F_N*f(x)=\frac{1}{N+1}\sum_{n=0}^{N}f*D_n(x)=\frac{1}{N+1}\sum_{n=0}^{N}S_n(f,x),\end{eqnarray}\]the previous theorem then finishes this proof.
Since $\int_{-\frac{1}{2}}^{\frac{1}{2}}e^{2\pi i k x}dx=0$ unless $k=0$, we see that the average of each $D_n$ is $1$, so the average of their averages is $1$, that is, condition (i) holds.
Nonnegativity tells that
\[\begin{eqnarray}\int_{ -\frac{1}{2}}^{\frac{1}{2}}|F_N(x)|dx=\int_{ -\frac{1}{2}}^{\frac{1}{2}}F_N(x)dx=1,\end{eqnarray}\]so (ii) is satisfied.
A short computation similar to what was done for Dirichlet kernels reveals that
\[\begin{eqnarray}F_N(x)=\frac{1}{N+1}\left(\frac{\sin(\pi(N+1)x)}{\sin(\pi x)}\right)^2.\end{eqnarray}\]Therefore, it holds that
\[\begin{eqnarray}\int_{{\varepsilon}<|x|<\frac{1}{2}}|F_N(x)|dx&\leq& \frac{2}{N+1}\int_{\varepsilon}^{\frac{1}{2}}\frac{dx}{\sin^2(\pi x)}\\&\leq& \frac{2}{N+1}\int_{\varepsilon}^{\frac{1}{2}}\frac{dx}{4x^2}\\&=&\frac{1}{2(N+1)}(\varepsilon^{-1}-2)\to 0\end{eqnarray}\]as $N\to \infty$ (we used the evenness of $F_N$, the bound $\sin^2(\pi(N+1)x)\leq 1$, and $\sin(\pi x)\geq 2x$ for $0\leq x\leq \frac{1}{2}$), so also (iii) holds, meaning that $(F_N)$ is a good family of kernels. ■
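The ''short computation'' behind the closed form of $F_N$ can be sketched as follows (a worked version added for illustration, valid for $x\notin\mathbb{Z}$). Summing the geometric series and then multiplying numerator and denominator by $e^{-\pi ix}$,
\[\begin{eqnarray}\sum_{k=0}^N e^{2\pi i (k-\frac{N}{2})x}&=&e^{-\pi iNx}\frac{e^{2\pi i(N+1)x}-1}{e^{2\pi ix}-1}\\&=&\frac{e^{\pi i(N+1)x}-e^{-\pi i(N+1)x}}{e^{\pi ix}-e^{-\pi ix}}\\&=&\frac{\sin(\pi(N+1)x)}{\sin(\pi x)},\end{eqnarray}\]and squaring and dividing by $N+1$ gives the formula. For instance, for $N=1$ one has $F_1(x)=\frac{1}{2}(D_0(x)+D_1(x))=1+\cos(2\pi x)$, while $\frac{1}{2}\left(\frac{\sin(2\pi x)}{\sin(\pi x)}\right)^2=2\cos^2(\pi x)=1+\cos(2\pi x)$.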
Fejér's result has many useful corollaries; in particular, we see that the Fourier coefficients characterize a function almost everywhere.
Corollary 1. Let $f\in L^1([0,1])$. If $S_n(f,x)$ converges for almost every $x$ to some limit, the limit must be $f(x)$ for almost every $x$.
Proof. The averages of the Fourier partial sums converge to $f$ in $L^1$ by Theorem 2, so a subsequence of them converges to $f(x)$ almost everywhere. On the other hand, the averages converge pointwise to the limit of $S_n(f,x)$ wherever it exists. ■
Corollary 2. The Fourier coefficients determine an $L^1$-function in the following sense: If $f,g\in L^1([0,1])$ and $\hat{f}(n)=\hat{g}(n)$ for all $n$, then $f=g$ almost everywhere.
Proof. The function $h=f-g$ has identically zero Fourier coefficients, so $h*F_N$ is identically zero for every $N$. On the other hand, $h*F_N$ converges to $h$ in $L^1$ norm by Theorem 2, so $\|h\|_1=0$, that is, $h(x)=0$ for almost every $x$. ■
Theorem 4 (Weierstrass polynomial approximation). Given any continuous function $f$ on a closed interval $[a,b]$ with $f(a)=f(b)$ and $\varepsilon>0$, there exists a trigonometric polynomial $P$ such that $\sup_{x\in [a,b]}|f(x)-P(x)|<\varepsilon.$
Proof. We may assume $[a,b]=[0,1]$ by rescaling. Since $f(0)=f(1)$, the $1$-periodic extension of $f$ is continuous, so by Fejér's theorem the trigonometric polynomials $\frac{1}{N+1}\sum_{n=0}^NS_n(f,\cdot)$ converge uniformly to $f$, and we get the claim. ■
Dini's test
Theorem 5 (Riemann-Lebesgue lemma). Let $f\in L^1([0,1])$. Then $\hat{f}(n)\to 0$ as $|n|\to \infty$.
Proof. If $f$ is a trigonometric polynomial, the statement is obvious, as $\hat{f}(n)=0$ for large $|n|$. Any $f\in L^1([0,1])$ can be approximated by trigonometric polynomials $P$ in $L^1$-norm (using $f*F_N$, for instance), and $|\hat{f}(n)-\hat{P}(n)|\leq \|f-P\|_1$ for every $n$, so the general case follows from the case of trigonometric polynomials. ■
Theorem 6 (Dini's test). Suppose that $f\in L^1([0,1])$ is $1$-periodic, $x\in [0,1]$, and
\[\begin{eqnarray}\int_{0}^{\frac{1}{2}}\left|\frac{f(x+t)+f(x-t)}{2}-f(x)\right|\frac{dt}{t}<\infty.\end{eqnarray}\]Then the Fourier series of $f$ converges to $f(x)$ at the point $x$.
Proof. It suffices to show the following: If
\[\begin{eqnarray}\int_{0}^{\frac{1}{2}}\left|\frac{f(t)+f(-t)}{2}-a\right|\frac{dt}{t}<\infty,\end{eqnarray}\]then $S_N(f,0)\to a$. Indeed, by considering $g(t)=f(x+t)$ and taking $a=f(x)$ we get the general case. In this special case, we can further suppose $a=0$, as otherwise we may consider $g(x)=f(x)-a$. Using the summation formula for sines,
\[\begin{eqnarray}S_N(f,0)&=&\int_{-\frac{1}{2}}^{\frac{1}{2}}\frac{\sin((2N+1)\pi t)}{\sin(\pi t)}f(-t)dt\\&=&\int_{-\frac{1}{2}}^{\frac{1}{2}}\frac{\cos(\pi t)}{\sin(\pi t)}f(-t)\sin(2N\pi t)dt+\int_{-\frac{1}{2}}^{\frac{1}{2}}f(-t)\cos(2N\pi t)dt.\end{eqnarray}\]By the Riemann-Lebesgue lemma (taking the real part), the latter integral converges to $0$. The former integral is
\[\begin{eqnarray}\int_{0}^{\frac{1}{2}}(f(t)+f(-t))\frac{\cos(\pi t)}{\sin(\pi t)}\sin(2N\pi t)dt.\end{eqnarray}\]We want to show that this approaches zero, and by the Riemann-Lebesgue lemma, this would follow if the function $(f(t)+f(-t))\frac{\cos(\pi t)}{\sin(\pi t)}$ were absolutely integrable over $[0,\frac{1}{2}]$. But this function is bounded in absolute value by $\frac{|f(t)+f(-t)|}{2t}$ (since $\sin(\pi t)\geq 2t$ on $[0,\frac{1}{2}]$), and our condition is precisely the integrability of this function over $[0,\frac{1}{2}]$. ■
We will see that Dini's test has numerous useful consequences; we are for example able to prove that the Fourier series of any Hölder continuous function converges to the right value.
Corollary 3. Let $f\in L^1([0,1])$, and assume
\[\begin{eqnarray}\int_{-\frac{1}{2}}^{\frac{1}{2}}\left|\frac{f(x+t)-f(x)}{t}\right|dt<\infty.\end{eqnarray}\]Then the Fourier series of $f$ converges to $f(x)$ at $x$.
Proof. With this assumption, the integral in Dini's test satisfies
\[\begin{eqnarray}
\int_{0}^{\frac{1}{2}}\left|\frac{f(x+t)-2f(x)+f(x-t)}{2t}\right|dt&=&\int_{0}^{\frac{1}{2}}\left|\frac{\frac{f(x+t)-f(x)}{t}-\frac{f(x)-f(x-t)}{t}}{2}\right|dt\\&\leq& \int_{0}^{\frac{1}{2}}\left|\frac{f(x+t)-f(x)}{t}\right|dt+\int_{0}^{\frac{1}{2}}\left|\frac{f(x)-f(x-t)}{t}\right|dt\\&= &\int_{-\frac{1}{2}}^{\frac{1}{2}}\left|\frac{f(x+t)-f(x)}{t}\right|dt<\infty.
\end{eqnarray}\] ■
Corollary 4. Let $f\in L^1([0,1])$ and let $x$ be such that $|f(x+t)-f(x)|\leq C|t|^{\alpha}$ for some $C,\alpha>0$ and all $t$. Then the Fourier series of $f$ converges to $f(x)$ at $x$. In particular, if $f$ is in the Hölder space $C^{\alpha}$ (this means that the condition is satisfied for all $x$, with $C$ independent of $x$), its Fourier series converges to $f$ pointwise everywhere.
Proof. The condition in the previous corollary is satisfied because $\int_{-\frac{1}{2}}^{\frac{1}{2}}|t|^{\alpha-1}dt$ converges. ■
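For concreteness, the integral can be evaluated explicitly (a worked computation added for illustration):
\[\begin{eqnarray}\int_{-\frac{1}{2}}^{\frac{1}{2}}\left|\frac{f(x+t)-f(x)}{t}\right|dt&\leq& C\int_{-\frac{1}{2}}^{\frac{1}{2}}|t|^{\alpha-1}dt\\&=&2C\int_{0}^{\frac{1}{2}}t^{\alpha-1}dt\\&=&\frac{2^{1-\alpha}C}{\alpha}.\end{eqnarray}\]This is finite precisely because $\alpha>0$; for $\alpha=0$ the integral $\int_0^{\frac{1}{2}}\frac{dt}{t}$ diverges, which is why mere boundedness of $f$ near $x$ is not enough for this argument.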
Corollary 5. Assume that $f$ is piecewise $C^1$ and that $f'$ has left and right limits at every point. Then
\[\begin{eqnarray}
S_N(f,x)\to\lim_{t\to 0}\frac{f(x+t)+f(x-t)}{2}
\end{eqnarray}\]as $N\to \infty$ for all $x$.
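Proof. At the points that are not discontinuity points, this has already been shown. At a discontinuity point $x$, write $f(x^+)$ and $f(x^-)$ for the one-sided limits of $f$. By assumption, $\frac{f(x+t)-f(x^+)}{t}$ remains bounded for small $t>0$, as well as $\frac{f(x-t)-f(x^-)}{t}$. Therefore $\left|\frac{f(x+t)+f(x-t)}{2}-a\right|\frac{1}{t}$ with $a=\frac{f(x^+)+f(x^-)}{2}$ is integrable over $(0,\frac{1}{2})$, so we may apply Dini's test in the form shown in its proof, with this value of $a$ in place of $f(x)$. ■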
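As an illustration (an example chosen here, assuming the convention $\hat{f}(n)=\int_0^1 f(y)e^{-2\pi iny}dy$ that is implicit in the proof of Theorem 1), consider the sawtooth function $f(x)=x$ on $[0,1)$, extended $1$-periodically. It is piecewise $C^1$ with jumps at the integers. Integration by parts gives $\hat{f}(0)=\frac{1}{2}$ and $\hat{f}(n)=\frac{i}{2\pi n}$ for $n\neq 0$, so pairing the terms $n$ and $-n$,
\[\begin{eqnarray}S_N(f,x)&=&\frac{1}{2}+\sum_{n=1}^{N}\frac{i}{2\pi n}\left(e^{2\pi inx}-e^{-2\pi inx}\right)\\&=&\frac{1}{2}-\sum_{n=1}^{N}\frac{\sin(2\pi nx)}{\pi n}.\end{eqnarray}\]At the jump $x=0$ every partial sum equals $\frac{1}{2}$, which is exactly the average of the one-sided limits $f(0^+)=0$ and $f(0^-)=1$, in agreement with Corollary 5.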
$L^2$ convergence
Note that the set $(e_n)_{n=-\infty}^{\infty}$, where $e_n(x)=e^{2\pi inx}$, is orthonormal in $L^2$ (that is, $\langle e_m,e_n\rangle=1$ if $m=n$ and equals $0$ otherwise) by the orthogonality of the exponentials. Now, one has the Pythagorean theorem
\[\begin{eqnarray}
\left\|\sum_{n=M}^N a_ne_n\right\|^2=\sum_{n=M}^N |a_n|^2
\end{eqnarray}\]for any complex numbers $a_n$. The name Pythagorean theorem arises from the fact that if $e_1$ and $e_2$ are orthonormal in the Hilbert space $\mathbb{R}^2$, then $|a_1e_1+a_2e_2|^2=|a_1|^2+|a_2|^2$ by the corresponding theorem from classical geometry. The proof is straightforward using the inner product:
\[\begin{eqnarray}
\left\|\sum_{n=M}^N a_ne_n\right\|^2&=&\left\langle \sum_{n=M}^N a_ne_n, \sum_{n=M}^N a_ne_n\right\rangle\\&=&\sum_{m,n=M}^N a_m \bar{a_n}\langle e_m,e_n\rangle\\&=&\sum_{n=M}^N |a_n|^2.
\end{eqnarray}\]
We start by showing that the Fourier partial sums are the best way to approximate an $L^2$ function by trigonometric polynomials.
Theorem 7. For any $f\in L^2([0,1])$, complex numbers $b_n$ and natural number $N$,
\[\begin{eqnarray}
\|f-S_N(f)\|\leq \left\|f-\sum_{n=-N}^N b_ne_n\right\|.
\end{eqnarray}\]
Proof. Denote by $a_n=\langle f,e_n\rangle$ the Fourier coefficients. Employing the Pythagorean theorem, we compute
\[\begin{eqnarray}\|f-S_N(f)\|^2&=&\|f\|^2+\left\|\sum_{n=-N}^N a_ne_n\right\|^2-2\Re\left(\sum_{n=-N}^{N}\langle f,\langle f,e_n\rangle e_n\rangle\right)\\&=&\|f\|^2+\sum_{n=-N}^N |a_n|^2-2\Re\left(\sum_{n=-N}^{N}|\langle f,e_n\rangle|^2\right)\\&=&\|f\|^2-\sum_{n=-N}^N |a_n|^2.\end{eqnarray}\]On the other hand, for any $b_n$, one has
\[\begin{eqnarray}\left\|f-\sum_{n=-N}^N b_ne_n\right\|^2&=&\|f\|^2+\sum_{n=-N}^N |b_n|^2-2\Re\left(\sum_{n=-N}^{N}\langle f,b_ne_n\rangle\right)\\&=&\|f\|^2+\sum_{n=-N}^N |b_n|^2-2\Re\left(\sum_{n=-N}^N a_n \overline{b_n}\right),\end{eqnarray}\]and this is not smaller than the first quantity, due to $\sum_{n=-N}^N |a_n-b_n|^2\geq 0$. ■
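Spelled out (a worked step added for illustration), the comparison at the end of the proof reads
\[\begin{eqnarray}\left\|f-\sum_{n=-N}^N b_ne_n\right\|^2-\|f-S_N(f)\|^2&=&\sum_{n=-N}^N |a_n|^2-2\Re\left(\sum_{n=-N}^N a_n \overline{b_n}\right)+\sum_{n=-N}^N |b_n|^2\\&=&\sum_{n=-N}^N |a_n-b_n|^2,\end{eqnarray}\]so equality holds only when $b_n=a_n$ for all $|n|\leq N$, that is, the partial sum $S_N(f)$ is the unique best approximation of this form.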
Theorem 8 (Riesz-Fischer). The Fourier series of a function $f\in L^2([0,1])$ converges to $f$ in the $L^2$ norm. Conversely, if the Fourier series of an integrable function $f$ converges in the $L^2$ norm, we have $f\in L^2([0,1])$.
Proof. Consider the sequence $(e_n)$, which turns out to be a basis for $L^2$. The span of these functions is exactly the set of trigonometric polynomials, and trigonometric polynomials are dense in $L^2$. This follows from the fact that $f*F_N$ is always a trigonometric polynomial and $(F_N)$ is a good family of kernels, so that $\|f-f*F_N\|\to 0$ by Theorem 2. Therefore, for $f\in L^2$, there exists a trigonometric polynomial $P_n$ of degree at most $n$ satisfying $\|f-P_n\|<\varepsilon(n)$, where $\varepsilon(n)$ is some function tending to $0$. By the previous theorem, this implies $\|f-S_n(f)\|<\varepsilon(n)$, so $S_n(f)$ converges to $f$ in $L^2$.
For the converse, let $f$ be integrable, and let $a_n=\langle f,e_n\rangle$ be its Fourier coefficients. Suppose that $S_n(f)\to f$ in $L^2$. Then $\sum_{n=-\infty}^{\infty}|a_n|^2$ must converge, for otherwise by the continuity of the norm
\[\begin{eqnarray}\left\| \sum_{n=-\infty}^{\infty}a_ne_n \right\|^2&=&\lim_{N\to \infty}\left\|\sum_{n=-N}^{N}a_ne_n\right\|^2\\&=&\lim_{N\to \infty}\sum_{n=-N}^N |a_n|^2=\infty,\end{eqnarray}\]which is a contradiction. Now, with the assumption $\sum_{n=-\infty}^{\infty}|a_n|^2<\infty$, we find
\[\begin{eqnarray}\left\|\sum_{n=M}^Na_n e_n\right\|^2&=&\sum_{n=M}^{N}|a_n|^2\\&\leq& \sum_{n=M}^{\infty}|a_n|^2\to 0\end{eqnarray}\]as $M\to \infty$. Hence $A_N:=\sum_{n=0}^{N}a_n e_n$ is a Cauchy sequence, so by the completeness of $L^2$ (which is a well-known fact based on the monotone convergence theorem), it converges to some $g\in L^2$. Similarly, $B_M:=\sum_{n=-M}^{-1}a_n e_n$ is Cauchy; let $h\in L^2$ be its limit. Then $\varphi:=g+h$ satisfies, by the continuity of the inner product,
\[\begin{eqnarray}\langle \varphi,e_m\rangle=\lim_{N\to \infty}\sum_{n=-N}^N a_n\langle e_n,e_m \rangle=a_m.\end{eqnarray}\]If we denote $\psi=f-\varphi$, we get $\langle \psi,e_m\rangle=0$ for all $m$. Since $\psi$ is integrable and all of its Fourier coefficients vanish, Corollary 2 gives $\psi=0$ almost everywhere, so $f=\varphi$ almost everywhere. Therefore $f\in L^2$. ■
We get a few interesting corollaries. Along with the Riesz-Fischer theorem, they show that Fourier series behave essentially in the best imaginable way when it comes to $L^2$ convergence.
Corollary 6 (Parseval's theorem). For $f,g\in L^2([0,1])$, one has
\[\begin{eqnarray}\langle f,g\rangle=\sum_{n=-\infty}^{\infty}\hat{f}(n)\overline{\hat{g}(n)}.
\end{eqnarray}\]
Proof. Let $a_n=\hat{f}(n)$, $b_n=\hat{g}(n)$. We write $f$ and $g$ as Fourier series to obtain
\[\begin{eqnarray}\langle f,g \rangle&=&\lim_{N\to \infty}\left\langle \sum_{n=-N}^{N}a_ne_n, \sum_{n=-N}^{N}b_ne_n \right\rangle\\&=&\lim_{N\to \infty}\sum_{n=-N}^N a_n \overline{b_n}=\sum_{n=-\infty}^{\infty}\hat{f}(n)\overline{\hat{g}(n)}.\end{eqnarray}\]In the previous post, we had already proved a special case of this, but for the case of general $L^2$ functions one needs the theory above. ■
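As a classical illustration (an example chosen here, assuming the convention $\hat{f}(n)=\int_0^1 f(y)e^{-2\pi iny}dy$), apply the corollary with $g=f$ and $f(x)=x$ on $[0,1)$. Integration by parts gives $\hat{f}(0)=\frac{1}{2}$ and $\hat{f}(n)=\frac{i}{2\pi n}$ for $n\neq 0$, hence
\[\begin{eqnarray}\frac{1}{3}=\int_0^1 x^2dx&=&\sum_{n=-\infty}^{\infty}|\hat{f}(n)|^2\\&=&\frac{1}{4}+2\sum_{n=1}^{\infty}\frac{1}{4\pi^2n^2},\end{eqnarray}\]which rearranges to $\sum_{n=1}^{\infty}\frac{1}{n^2}=2\pi^2\left(\frac{1}{3}-\frac{1}{4}\right)=\frac{\pi^2}{6}$.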
Corollary 7. Let $f\in L^1([0,1])$. Then $f\in L^2([0,1])$ if and only if $\sum_{n=-\infty}^{\infty}|\hat{f}(n)|^2<\infty.$
This was proved in the course of the proof of the Riesz-Fischer theorem. Interestingly, this actually gives us a bijection between $L^2([0,1])$ and $\ell^2(\mathbb{Z})$, and this is also an isometry by Parseval's theorem. One can show using similar ideas as above that any infinite dimensional separable Hilbert space is isometrically isomorphic to the ''simple'' space $\ell^2(\mathbb{Z})$. Of course, these spaces often have some other structure in addition to the inner product, so this fact does not mean that every interesting property of such a space is already visible in $\ell^2(\mathbb{Z})$.