We will prove Ramsey's theorem, including its generalizations for multicolored graphs and hypergraphs. As corollaries we will prove Schur's theorem on the solvability of Fermat's equation $x^n+y^n=z^n, xyz\neq 0$ modulo any number, and the Happy Ending problem on finding convex polygons in an arbitrary set of points. We will also derive a lower bound due to Erdös for the Ramsey numbers. The bound is based on Erdös' probabilistic method, which has proved to be a very fruitful tool in combinatorics despite its simplicity. It is quite remarkable that the lower bound from 1947 that we are going to prove is the best one known up to a constant factor. Furthermore, so far all lower bounds that do not appeal to the probabilistic method (in particular constructive bounds) are considerably weaker than the probabilistic bound -- they do not even grow exponentially.

Consider the following simple folklore problem: Among any six people, there are either three that each know each other, or three no two of which know each other. This is already a problem in Ramsey theory, and generalizing it will lead to Ramsey's theorem.

One can solve the folklore problem as follows. Label the people $A_1,....,A_6$ and consider them as points forming a regular hexagon. If two people know each other, we connect them with a red line; otherwise we connect them with a blue line. As $A_1$ is connected to five other people, there exist (by the pigeonhole principle) at least three people such that $A_1$ is connected to all of them with the same color. We may assume that the color is red and that $A_1$ is connected to $A_2,A_3$ and $A_4$ by that color. Consider now the connections between $A_2,A_3$ and $A_4$. If any of them is red, we have three people all knowing each other. If none of them is red, we have three people that do not know each other, so the problem is solved.

It is easy to see that the claim of the problem fails for five people: consider five people in a ring, each knowing their two neighbors and no other people. What we have proved can hence be stated as $R(3,3)=6.$ There is nothing special about $3$ (except that the simple argument worked), and we may formulate a more general statement, known as Ramsey's theorem (Ramsey also proved the generalization of Theorem 1 to multicolored hypergraphs, a result which we will show later).

**Theorem 1 (Ramsey's theorem).**Let $k$ and $\ell$ be given positive integers. There exists a number $N$ such that if the edges of any graph on $N$ vertices are colored red and blue, there is either a red clique of size $k$ (that is, there is a red edge between any two of these $k$ points) or a blue clique of size $\ell$. The smallest such $N$, denoted by $R(k,\ell)$, satisfies

\[\begin{eqnarray}R(k,\ell)\leq \binom{k+\ell-2}{k-1}.\end{eqnarray}\]

**Proof.**We give an elegant proof due to Erdös and Szekeres from 1935. We use induction on $k+\ell$ and show that

\[\begin{eqnarray}R(k,\ell)\leq R(k-1,\ell)+R(k,\ell-1).\quad \quad (1)\end{eqnarray}\]

The numbers on the right-hand side exist by assumption. Clearly $R(k,1)=R(1,k)=1$, so the base case is verified. Suppose all the cases $k+\ell<n$ hold and consider a case where $k+\ell=n.$ Let $G$ be a graph, colored red and blue, on $m=R(k-1,\ell)+R(k,\ell-1)$ vertices, and let $A$ be a vertex of it. Now $A$ is connected to at least $R(k-1,\ell)$ vertices by a red edge or to $R(k,\ell-1)$ vertices by a blue edge (otherwise, there would be in total at most $m-2$ connections). In the first case, we have a subgraph $H$ of $G$ containing either a red clique of size $k-1$ or a blue clique of size $\ell$. When we add $A$ to this graph, we see that $G$ contains the desired structure. The second case is similar, and again we see that $G$ contains the desired structure. From Pascal's triangle, it follows that $(1)$ implies the bound $R(k,\ell)\leq \binom{k+\ell-2}{k-1}.$ ■

An immediate corollary is the following bound for the most interesting Ramsey numbers $R(k,k)$.

**Corollary.**We have

\[\begin{eqnarray}R(k,k)\leq (1+o(1))\frac{4^k}{\sqrt{\pi k}}.\end{eqnarray}\]

**Proof.**Stirling's formula says that $n!\sim n^ne^{-n}\sqrt{2\pi n}$. Inserting this into the definition of $\binom{2n}{n}$, the claim follows. ■

The bound for $R(k,\ell)$ given by Theorem 1 is surprisingly sharp for small values. We have for example

\[\begin{eqnarray}&R(3,4)\leq R(3,3)+R(2,4)=6+4=10,\\&R(3,5)\leq R(3,4)+R(2,5)\leq 10+5=15,\end{eqnarray}\]

and both bounds are only one off from the right value. One can even get the precise values $9$ and $14$ by making the following observation in the proof of Theorem 1: We actually have

\[\begin{eqnarray}R(k,\ell)\leq R(k,\ell-1)+R(k-1,\ell)-1\end{eqnarray}\]

when $a=R(k,\ell-1)$ and $b=R(k-1,\ell)$ are both even. Indeed, if $G$ is a graph on $a+b-1$ vertices and $A$ is a vertex of it, $A$ either has at lest $a$ red neighbors or at least $b$ blue neighbors, or the (red) degree of $A$ is $a-1$ or $b-1$, and hence odd. In the first case, the proof goes as before, so we may assume that every vertex $A$ has an odd (red) degree. As $a+b-1$ is odd, this means that the sum of the degrees is odd. However, in any graph the sum of the degrees is even, as each edge is counted twice. This proves the improved version of $(1)$.

Now that we have proved that any large finite graph contains a clique or an anticlique of a given size, we may ask what can be said about infinite graphs. Trivially, Ramsey's theorem tells that an infinite graph contains a clique or an anticlique of fixed size. However, we can even say that it contains an infinite substructure.

**Theorem 2 (infinite Ramsey's theorem).**Let $G$ be an infinite graph whose edges are colored red and blue. Then $G$ contains an infinite monochromatic clique.

**Proof.**We use a greedy algorithm. A similar idea could be used to prove the finiteness of $R(k,\ell)$, but would not produce a bound quite as good as in Theorem 1. Let $A$ be a vertex of $G$. $A$ has infinitely many neighbors, so by the pigeonhole principle there are either infinitely vertices to which $A$ is connected by a red edge, or infinitely many to which $A$ is connected by a red edge. Therefore, we obtain a subgraph $G_1$ not containing $A$ such that $A$ is connected to each vertex in $G_1$ by the same color. Now pick a vertex $A_1$ of $G_1$. It has infinitely many neighbors in $G_1$, so there is a subgraph $G_2$ of $G_1$ not containing $A_1$ such that $A_1$ is connected to each vertex in $G_2$ by the same color. Continuing like this, we get a sequence $G,G_1,G_2,...$ of infinite graphs, each being a subgraph of the previous one, and we get a sequence $A,A_1,A_2,...$ of vertices from them. Each graph $G_i$ is monochromatic, so there is an infinite subsequence $G_{i_1},G_{i_2},...$ such that each of these graphs is red (without loss of generality). Now the vertices $A_{i_1},A_{i_2},...$ of $G$ form an infinite red clique, since by definition $A_{i_j}$ is connected to each of $A_{i_{j+1}},A_{i_{j+2}},...$ by a red edge. ■

**Multicolor Ramsey numbers**
The next step in generalizing Ramsey's theorem is adding more colors. Now we color the edges of our graph with $r$ colors and seek for a monochromatic clique. This generalization can be stated and proved basically as the finiteness of $R(k,\ell)$.

**Theorem 3 (Multicolor Ramsey's theorem).**Let $r\geq 2$, and let $k_1,...,k_r$ be given positive integers. There exists a number $N$ such that if any graph $G$ on $N$ vertices is colored with $r$ colors called $c_1,...,c_r$, there exists $i$ such that $G$ contains a clique of color $c_i$ and size $k_i$.

**Proof.**Suppose that $r$ is the smallest number of colors for which the statement has not yet been proven (for $r=2$ it is true). We show that

\[\begin{eqnarray}R(k_1,...,k_r)\leq \sum_{i=1}^r R(k_1,...,k_i-1,...,k_r)\quad \quad (2)\end{eqnarray}\]

by induction on $k_1+...+k_r.$ We have $R(k_1,...,k_{r-1},1)=R(k_1,...,k_{r-1})$, so the cases $k_i=1$ have been dealt with. Assume that the claim holds whenever $k_1+...+k_r<m$ and consider the case $k_1+...+k_r=m$. The numbers on the right-hand side of $(2)$ exist by the induction assumption. Let their sum be $S$, and let $G$ be a graph on $S$ vertices. By the pigeonhole principle, for any vertex $A$ of $G$, there exists $i$ such that $A$ has at least $R(k_1,...,k_i-1,...,k_r)$ neighbors that are connected to it with color $c_i$. These neighbors give us a subgraph $H$, and adding $A$ to it, this subgraph contains a clique of color $c_i$ and size $k_i$ for some $i$. Therefore the claim holds also for $r$ colors. ■

The proof above gives an upper bound for $R(k_1,...k_r)$ in terms of the multinomial coefficients

\[\begin{eqnarray}\binom{n}{a_1,a_2,...,a_r}:=\frac{n!}{a_1!....a_r!},\end{eqnarray}\]

which are defined for nonnegative integers $n,a_1,...,a_r$ satisfying $n=a_1+...+a_r.$ It is easy to see that this multinomial coefficient, which we denote shortly as $M(n;a_1,...,a_r)$, counts the number of ways one can place $n$ different objects into $r$ bins in such a way that the $i$th bin contains $a_i$ objects (the ordinary binomial coefficients give the answer to this counting problem when $r=2$). We have the recursion

\[\begin{eqnarray}M(n;a_1,...,a_r)=\sum_{i=1}^r M(n-1;a_1,...,a_i-1,...,a_r),\end{eqnarray}\]

since if we fix one of the $n$ objects that are to be placed in the bins, the right-hand side gives the number of ways for placing the objects. Now we get the following bound.

**Corollary.**We have, for $k_1,...,k_r\geq 2,$

\[\begin{eqnarray}R(k_1,...,k_r)\leq \binom{k_1+...+k_r-2r+2}{k_1-1,k_2-1,k_3-2,...,k_r-2}\quad \quad (3).\end{eqnarray}\]

**Proof.**We use induction on $r$. The case $r=2$ is contained in Theorem 1. Denote by $f(k_1,...,k_r)$ the function on the right-hand side of $(3)$. Then $f$ has the property

\[\begin{eqnarray}f(k_1,...,k_r)=\sum_{i=1}^r f(k_1,...,k_i-1,....,k_r)\end{eqnarray}\]

as the multinomial coefficients have this property. By the induction hypothesis,

\[\begin{eqnarray}R(k_1,...,k_{r-1},2)\leq \binom{k_1+...+k_r-2(r-1)+2}{k_1-1,k_2-1,...,k_r-1-2,0}=f(k_1,...,k_{r-1},2),\end{eqnarray}\]

and similarly we see that $R(k_1,..,k_r)\leq f(k_1,..,k_r)$ whenever $k_i=2$ for some $i$. Now the initial values of $R(k_1,...,k_r)$ are bounded by the initial values of $f(k-1,..,k_r)$ and the upper bound for $R(k_1,..,k_r)$ satisfies the same recursion as $f(k_1,..,k_r)$. Hence the claim follows.■

In particular, for the numbers $R(3,3,...,3)$ ($k$ variables), which we denote as $R(3;k)$, we have

\[\begin{eqnarray}R(3;k)\leq \binom{k+2}{2,2,1,...,1},\end{eqnarray}\]

where $1$ occurs $k-2$ times. In other words,

\[\begin{eqnarray}R(3;k)\leq \frac{(k+2)!}{4}.\end{eqnarray}\]

The finiteness of $R(3;k)$ has an interesting application, seemingly unrelated to graphs, which we now present.

**Theorem 4 (Schur's theorem).**Let $k$ be a positive integer. There exists an integer $N$ such that whenever the numbers in $\{1,2,...,N\}$ are colored with $k$ colors, one can find distinct $x,y$ and $z$ from this set, each of the same color, satisfying $x+y=z$. The smallest such $N$ is the Schur number $S(k)$ and satisfies

\[\begin{eqnarray}S(k)\leq \frac{(k+2)!}{4}.\end{eqnarray}\]

**Proof.**Let $N=\frac{(k+2)!}{4}$, and let the integers in $\{1,2,...,N\}$ be colored with $k$ colors $c_1,...,c_k$. Form a graph from these integers in such a way that $i$ is connected to $j$ with the same color with which $|i-j|$ is colored. Now we have a graph of size at least $R(3;k)$ colored with $k$ colors, so there exists a monochromatic triangle. Let its vertices be $x,y$ and $z$ with $x<y<z$. Now $z-x,z-y$ and $y-x$ all have the same color, and $z-x$ is the sum of the other two numbers, as wanted. ■

Schur's theorem is in itself an interesting result on arithmetic Ramsey theory and somewhat similar to the existence of a monochromatic three-term arithmetic progression in any $k$-coloring of the set of positive integers (a special case of van der Waerden's theorem). However, Schur's theorem also has a curious application to Fermat's equation $x^n+y^n+z^n$, to be solved in integers with $xyz\neq 0$ and $n\geq 3$ (of course, Fermat's last theorem, proved by Andrew Wiles in 1995, says that there are no solutions). This was actually Schur's original motivation for his theorem. Perhaps one of the easiest possible ways to prove the insolvability of Fermat's equation for a specific exponent $n$ would be to find infinitely many positive integers $m$ such that $x^n+y^n=z^n\pmod m$ implies $xyz\equiv 0\pmod m$, because then we would have $xyz=0.$ Could such an approach work? Probably not, as no such sequence of moduli has been constructed, but it is actually a theorem of Schur that this simple approach is doomed to fail. If such an infinite sequence of moduli would exists, we could take them to be prime powers, but Theorem 5 gives a contradiction in this case.

**Theorem 5 (Schur).**Let $n\geq 3$ be a positive integer and $p^{\alpha}$ a prime power. Suppose that all the solutions to the congruence

\[\begin{eqnarray}x^n+y^n\equiv z^n \pmod{p^{\alpha}}\end{eqnarray}\]

satisfy $xyz\equiv 0\pmod{p^{\alpha}}.$ Then $p<S(n)$ and $\alpha\leq n.$

**Proof.**First assume $\alpha=1.$ Let $p\geq S(n)>n$, and let $g$ be a primitive root $\pmod p$ (see this post for the existence of primitive roots). We partition the nonzero congruence classes modulo $p$ into $n$ sets:

\[\begin{eqnarray}&&\{1,g^n,g^{2n},...\}\\&&\{g,g^{n+1},g^{2n+1},...\}\\&&...\\&&\{g^{n-1},g^{2n-1},g^{3n-1},...\}\end{eqnarray}\]

These sets do not intersect, because $g^a\equiv g^b\pmod p$ implies $a\equiv b\pmod p-1$.

Now if $p\geq S(n)$, Theorem 4 implies that

\[\begin{eqnarray}g^{an+j}+g^{bn+j}\equiv g^{cn+j} \pmod p\end{eqnarray}\]

for some $j\in \{1,2,..,n\}$ and integers $a,b$ and $c$. Now $x^n+y^n\equiv z^n\pmod p,$ where $x=g^a,y=g^b,z=g^c$ and $xyz\neq 0\pmod p.$

Now let $\alpha>1$ and $p\nmid n$. We use induction on $\alpha$ (and perform Hensel's lifting). Choose $x,y$ and $z$ so that $x^n+y^n\equiv z^n\pmod{p^{\alpha-1}},$ $xyz\neq 0\pmod p$. Notice that

\[\begin{eqnarray}(x+p^{\alpha-1}m)^n\equiv x^n+nx^{n-1}\cdot p^{\alpha-1}m\pmod p^{\alpha},\end{eqnarray}\]

and $nx^{n-1}m$ goes through all the residue classes $\pmod p$ when $m$ does, because $p\nmid nx^{n-1}$. This means that there exists $m$ such that $x'=x+p^{\alpha-1}m,y$ and $z$ satisfy $(x')^{n}+y^n\equiv z^n\pmod p,$ $x'yz\neq 0\pmod p$.

Finally, let $\alpha>n,p\mid n.$ Then we may choose $x=p^{\lceil\frac{\alpha}{n}\rceil},y=z=1$. Now trivially $x^n+y^n\equiv z^n\pmod{p^{\alpha}},$ but $p^{\alpha}\nmid xyz$.■

**Hypergraph Ramsey numbers**
The multicolor Ramsey theorem, despite being quite general, still has a vast generalization to hypergraphs. A collection $G$ of vertices forms a $t$-hypegraph ($t\geq 2$) if for any set of $t$ vertices it is determined whether they are connected to each other or not. A collection of $t$ vertices is a hyperedge if they are all connected. In other words, a $t$-hypergraph consists of a set of vertices $v_1,v_2,...$ and some hyperedges, which are $t$ element subsets of $\{v_1,v_2,...\}$. We can now color the hyperedges of a $t$-hypegraph $G$ with $r$ colors $c_1,...,c_r$, and ask whether there exists $i$ such that there is a $t$-hyperclique of color $c_i$ and size $k_i$. A $t$-hyperclique of size $k$ is a collection of $k$ vertices $(k\geq t)$ such that every $t$ of them form a hyperedge. The finiteness of the hypergraph Ramsey numbers was already proved by Ramsey in his 1930 paper.

**Theorem 6 (Hypergraph Ramsey's theorem).**Let $k_1,..,k_r$ and $t\geq 2$ be positive integers ($r\geq 2,t\leq \max\{k_1,...,k_r\}$). Then there exists $N$ such that if the hyperedges of any $t$-hypegraph on $N$ vertices are colored with colors $c_1,...,c_r$, there exists $i$ such that $G$ contains a $t$-hyperclique of color $c_i$ and size $k_i$. The smallest such $N$ is denoted by $R_t(k_1,...,k_r)$.

**Proof.**Also this proof is from the seminal 1935 paper of Erdös and Szekeres,

*A combinatorial problem in geometry*, where their main aim was proving the Happy Ending problem.

Suppose that the claim has already been proved for $t-1$-hypergraphs. For $t=2$ this is the case. Suppose also that the claim has been proved for $t$-hypergraphs that are colored with less than $r$ colors. For $r=2$ this is known (we could of course induct on $t$ and $r$, but having a large number of simultaneous inductions could be messy). We use induction on $k_1+...+k_r$ to establish the finiteness of $R_t(k_1,...,k_r)$. If $k_r=t$, we have $R_t(k_1,...,k_r)=R_t(k_1,...,k_{r-1})$ (because any $t$ vertices form a $t$-hyperedge of size $t$). The latter number is known to be finite by the case of $r-1$ colors.

We show that

\[\begin{eqnarray}&&R_t(k_1,...,k_r)\\&\leq& \sum_{i=1}^r R_{t-1}(R_t(k_1,...,k_i-1,...,k_r),...,R_t(k_1,...k_i-1,...,k_r))+1,\quad (4)\end{eqnarray}\]

where the number of colors is $r$ in each of the Ramsey numbers. The right-hand side is known to be finite by the induction assumption and the Ramsey theorem for $t-1$-hypergaphs. Let $N$ be the right-hand side of $(4)$, and consider a $t$-hypergraph $G$ on $N$ vertices, whose hyperedges are colored with colors $c_1,...,c_r$. Fix a vertex $A$ of $G$. We define the $t-1$-hypergraphs $H_i$, saying that a set $\{A_1,...,A_{t-1}\}$ of vertices of $G\setminus{A}$ is a hyperedge if $\{A,A_1,...,A_{t-1}\}$ is a hyperedge of color $c_i$ in the original hypergraph $G$. By the pigeonhole principle, there exists $j$ such that $H_j$ has at least $M$ vertices, where

\[\begin{eqnarray}M=R_{t-1}(R_t(k_1,...,k_j-1,...,k_r),...,R_t(k_1,...,k_j-1,...,k_r)).\end{eqnarray}\]

By definition, this means that for some $i$ the $t-1$-hypergraph $H_j$ contains a $t-1$-hyperclique of color $c_i$ and size $R_t(k_1,...,k_j-1,...,k_r)$. Let such a clique be called $K$. The vertices of $K$ define a $t$-hypegraph of size $R_t(k_1,...,k_j-1,...,k_r)$ when $K$ is viewed as a subset of $G$. By definition, $K$ contains either a $t$-hyperclique of size $k_j-1$ and color $c_j$, or a $t$-hyperclique of size $k_s$ and color $c_s$ for some $s\neq j$. In the latter case, we are done, as $K$ is a subset of $G$. In the former case, $K\cup \{A\}$ contains a $t$-clique of size $k_j$ and color $c_j$, because by assumption $A$ forms only hypercliques of color $c_j$ when combined with any $t-1$ vertices from $K$. ■

The bound obtained for the hypergraph Ramsey numbers is incredibly huge. We will not even try to write down a function that is an upper bound for these numbers, but rather show that such a function grows faster than the Ackermann numbers, which is a sequence growing with enormous speed. We define the Ackermann numbers by

\[\begin{eqnarray}a(1)&=&1+1=2,\\a(2)&=&2\cdot 2=4,\\a(3)&=&3^3=27,\\a(4)&=&4^{4^{4^4}}\\&...&\end{eqnarray}\]

The $n$-th number is obtained by applying the ''$n$-th operation'', called $f_n$, $n$ times, where the first operation $f_1$ is addition and the next operation is always the iteration of the previous one. In other words, $f_t$ is the $t$-th operation in the list addition, multiplication, exponentiation,.... We now have the following.

**Theorem 7.**Let $N_t(k)$ be the upper bound given by the proof of Theorem 6 to $R_t(k,k)$. Then $N_t(k)\geq f_t(k)$ for $k\geq 3t$. In particular, $N_t(3t)\geq a(t)$.

**Proof.**For $t=2,$ this is trivial. We use induction on $t$. We have

\[\begin{eqnarray}N_t(k)&\geq& N_{t-1}(N_t(k-1))\\&\geq& N_{t-1}^2(N_t(k-2))\\&\geq&...\\&\geq& N_{t-1}^{t+1}(k-t-1)\\&\geq& N_{t-1}^{t}(k)\\&\geq& N_{t-1}^{t-1}(f_{t-1}(k))\\&\geq& N_{t-1}^{t-2}(f_{t-1}^2(k))\\&\geq&...\\&\geq& f_{t-1}^{t}(k)=f_t(k),\end{eqnarray}\]

where $g^t$ denotes the $t$-th interate of a function and we used the bound $N_{t-1}(s)\geq R(s,s)\geq 2s$ for $t,s\geq 3$ on the fourth row.

Although the hypergraph Ramsey numbers are a bit abstract and grow faster than any reasonable function, they can be applied to solve the so-called Happy Ending problem, which states that any large enough set of non-collinear points in the plane contains a convex polygon on $k$ vertices. The problem was posed by Eshter Klein, who showed that any five non-collinear points contain a convex quadrilateral. The general case was proved by Erdös and Szekeres in 1935. The problem was given its funny name by Erdös, as it led to the marriage of Klein and Szekeres.

**Theorem 8 (Erdös-Szekeres).**Let $k$ be a given positive integer. There exists a number $N$ such that among any $N$ non-collinear points in the plane there are $k$ forming a convex polygon. The smallest such $N$ is denoted by $N(k)$ and satisfies

\[\begin{eqnarray}N(k)\leq R_3(k,k),\end{eqnarray}\]

where $4$ occurs $k$ times.

**Proof.**The proof is based on the following observation. If any $4$ points from the set $\{A_1,...,A_k\}$ form a convex quadrilateral, then $A_1A_2...A_k$ is a convex polygon. We prove this observation by induction on $k$, the case $k=4$ being trivial. Consider any $k+1$ points $A_1,...,A_{k+1}$, any $4$ of which form a convex polygon. Let these points be in clockwise order with respect to the origin. If $A_1A_2...A_{k+1}$ is not convex, then it has an angle greater than $\pi$ in measure. Say that $\angle A_1A_2A_3$ is such an angle. Now $A_1A_2A_3A_4$ is a non-convex quadrilateral, a contradiction.

Let $P_1,...,P_N$ be $N=R_3(k,k)$ non-collinear points in the plane. We form the $3$-hypergraph where $\{P_i,P_j,P_k\}$ is colored red if $i<j<k$ and $P_iP_jP_k$ is oriented counter-clockwise,, and $\{P_i,P_j,P_k\}$ is colored blue if $i<j<k$ and $P_iP_jP_k$ is oriented clockwise. By the definition of $N$, there exist $k$ points such that any triple among them is red, say. Let these points be $P_{i_1},...,P_{i_k}$. For any $a,b,c,d$ with $1\leq a<b<c<d\leq k$ we see that $P_{i_a}P_{i_b}P_{i_c}P_{i_d}$ is convex (because a non-convex quadrilateral contains both clockwise and counter-clockwise triangles). Hence $P_{i_1}...P_{i_k}$ is also convex. ■

The bound $N(k)\leq R_3(k,k)$ is very poor, and it will be improved to

\[\begin{eqnarray}N(k)\leq \binom{2k-4}{k-2}+1\end{eqnarray}\]

in a later post.

**The probabilistic method and a lower bound for Ramsey numbers**
So far we have mainly dealt with upper bounds, and our proofs based on recursive inductions. For lower bounds we must adopt a different strategy, known as the probabilistic method. This problem was Erdös' original application of the method, but many more applications have been found later. The principle states simply that if an event has probability $p<1$ of occurring, there is a case in which it does not occur. Usually applying the method involves defining some random combinatorial configuration and estimating the probability that it contains the desired structure. This is also the case here.

**Theorem 11 (Erdös).**Let $k\geq 2.$ We have $R(k,k)\geq (1-o(1))\frac{1}{\sqrt{2}e}\cdot k\sqrt{2}^{k}$.

**Proof.**Let $G$ be a graph with $m=\lfloor\frac{1-\varepsilon}{\sqrt{2}e}\cdot k\sqrt{2}^{k}\rfloor$ vertices, where $\varepsilon>0$ is arbitrary. Color its edges red and blue by flipping a fair coin to determine the colors. There are $\binom{m}{k}$ cliques of size $k$, and each of these cliques is monochromatic with probability $2^{1-\binom{k}{2}}$. Hence the probability that there is a monochromatic clique of size $k$ is, by the union bound, at most

\[\begin{eqnarray}\binom{m}{k}2^{1-\binom{k}{2}}:=p.\end{eqnarray}\]

If $p<1$, we are done. We have

\[\begin{eqnarray}p&\leq& \frac{m^k}{k!}2^{1-\frac{k(k-1)}{2}}\\&\leq& (1+o(1))\left(\frac{m}{k}\right)^k\cdot e^k\cdot\frac{1}{\sqrt{2\pi k}}\cdot 2^{1-\frac{k(k-1)}{2}}.\end{eqnarray}\]

This is smaller than $1$ if

\[\begin{eqnarray}\frac{m}{k}\cdot (1+o(1))\cdot e\cdot 2^{-\frac{k-1}{2}}<1,\end{eqnarray}\]

and for large enough $k$ this is satisfied when

\[\begin{eqnarray}m<(1-\varepsilon)\cdot\frac{1}{\sqrt{2}e}\cdot k\sqrt{2}^k,\end{eqnarray}\]

which was to be proved. ■

Actually, one can show that the bound of Theorem 11 holds without the factor $1-o(1)$ by estimating $p$ more carefully.

We have now proved the estimates

\[\begin{eqnarray}(1-o(1))k\cdot\frac{\sqrt{2}^k}{\sqrt{2}e}\leq R(k,k)\leq (1+o(1))\frac{4}{\sqrt{\pi k}}\end{eqnarray}\]

for the most interesting Ramsey numbers $R(k,k)$. It may seem completely irrelevant to keep track of constant factors such as $\frac{1}{\sqrt{2}e}$ and $\frac{1}{\sqrt{\pi}}$ or even polynomial factors such as $k$ in the estimates, because the upper and lower bound grow like exponential functions with different bases. However, the bounds above are nearly the best ones known, despite being over 60 years old and based on simple reasoning. Indeed, the lower bound has only been improved by a factor of $2$; this was done by Spencer in 1975. The upper bound was not even improved by an arbitrarily large constant factor until a 1987 paper of Graham and Rödl, but currently the best known result is due to Conlon from 2009 and states that

\[\begin{eqnarray}R(k,k)\leq k^{-c\frac{\log k}{\log \log k}}\cdot 4^k\end{eqnarray}\]

for some $c>0$, which improves the classical bound by more than any polynomial factor.

Finding the correct base for the exponential growth of $R(k,k)$, that is finding

\[\begin{eqnarray}\lim_{k\to \infty}R(k,k)^{\frac{1}{k}},\end{eqnarray}\]

is a major open problem in Ramsey theory. It has not even been proved that the limit exists or that the liminf is strictly greater than $\sqrt{2}$ or that the limsup is strictly less than $4$. There is not even a general agreement of what the limit should be. Numerical evidence is not very helpful for finding the limit, as only very few Ramsey numbers $R(k,k)$ are known; in fact, for $k\geq 5$, they are unknown, and according to Erdös, determining $R(6,6)$ is beyond what could be achieved with all the computers in the world (though at Erdös' time computers were far less efficient).