Combinatorics is a branch of mathematics concerning the study of (usually finite) discrete structures.
Some subfields:
Graph theory
Finite geometry — the study of finite sets of points
Enumerative combinatorics — counting the number of objects in a group (how many bridge hands have no face cards?)
Probabilistic combinatorics — the study of random discrete objects (a shuffled deck of cards)
Combinatorial optimization — finding the "best" discrete object in a group (the shortest route through 20 cities)
Extremal combinatorics
What is an evolutionary algorithm?
Here, an evolutionary algorithm will mean any algorithm that starts with a candidate solution to a problem and iteratively
searches for a better one.
The traveling salesman problem
Two things are necessary to use an evolutionary algorithm to solve a problem:
A way to get from one potential solution to other "nearby" ones
A quantitative measure of how good a solution is (its fitness)
Both of these operations should be fast.
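For the traveling salesman problem, for instance, both ingredients are easy to supply. A minimal Python sketch (the 2-opt segment reversal used here is one common choice of neighbor move, not the only one):

```python
import math
import random

def tour_length(cities, tour):
    """Fitness (to be minimized): total length of the closed tour."""
    return sum(
        math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )

def random_neighbor(tour, rng):
    """A 'nearby' solution: reverse a random segment of the tour (a 2-opt move)."""
    i, j = sorted(rng.sample(range(len(tour)), 2))
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]

# Four cities at the corners of a unit square; the optimal tour has length 4.
cities = [(0, 0), (0, 1), (1, 1), (1, 0)]
print(tour_length(cities, [0, 1, 2, 3]))  # 4.0
```

Both operations run in time linear in the number of cities, meeting the speed requirement above.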
Example - Ant colony optimization
Common uses for evolutionary algorithms
Evolutionary algorithms are used every day for approximating solutions to NP-hard problems:
Travelling salesman
Bin-packing
Scheduling
Route optimization
Each of the above is important to, for instance, a package delivery service. None of them can be solved exactly at realistic
sizes, and they are all combinatorial.
Evolutionary algorithms as a tool for research
Evolutionary algorithms are also used to attack fundamental problems in mathematics:
Searching for an object that may or may not exist within a large collection
Finding objects that tell us something about a combinatorial problem
Providing direction when trying to prove a theorem
Ramsey theory - definitions
A simple graph \(G = (V,E)\) consists of a set \(V\) of vertices and a set \(E\) of unordered pairs of distinct elements of \(V\). If \(v,w \in V\)
and \(\{v,w\} \in E\), we write \(v \sim w\) and say that \(v\) and \(w\) are adjacent.
The complete graph \(K_n\) is the graph with \(n\) vertices and all possible edges. The graphs \(K_n\) for \(n = 2,3,4,5\)
are shown below:
For a graph \(G = (V,E)\), a 2-coloring of \(G\) is a partition of its edges into two sets, often called "red" and "blue".
Ramsey theory - definitions
Given graphs \(G = (V,E)\) and \(G' = (V',E')\), we say that \(G'\) is a subgraph of \(G\) if \(V' \subseteq V\) and \(E' \subseteq E\).
If \(E' = \{\{v,w\} \in E : v, w \in V'\}\) (that is, no edges between vertices of \(V'\) have been discarded), then \(G'\) is an induced subgraph of \(G\).
A graph \(G\)
Some induced subgraphs of \(G\)
Ramsey's theorem
Theorem (F.P. Ramsey, 1930): For every \(n \in \mathbb{N}\), there is a least integer \(R(n)\) such that if the edges of
\(K_{R(n)}\) are 2-colored, there is a monochromatic \(K_n\) as a subgraph.
For example, \(R(3) > 5\); we have seen a 2-coloring of \(K_5\) with no monochromatic \(K_3\):
\(R(3) = 6\)
Unlike \(K_5\), if we choose any 2-coloring of \(K_6\), we cannot avoid a monochromatic
\(K_3\) (triangle). To see this, choose a 2-coloring and let \(v\) be any vertex. Since \(v\) has degree 5, by the pigeonhole
principle some 3 of its edges must be the same color (say, red).
If any of the dotted lines above is red, then it forms a red triangle with two of the red edges we have drawn. If all 3 are blue, then they form a blue triangle themselves.
Therefore, a monochromatic triangle is unavoidable, and \(R(3) = 6\).
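The pigeonhole argument above can also be checked exhaustively by computer, since \(K_6\) has only \(2^{15}\) edge 2-colorings. A small brute-force sketch:

```python
from itertools import combinations, product

def has_mono_triangle(n, coloring):
    """coloring maps each edge (i, j) with i < j to a color 0 or 1."""
    return any(
        coloring[(a, b)] == coloring[(a, c)] == coloring[(b, c)]
        for a, b, c in combinations(range(n), 3)
    )

def forces_mono_triangle(n):
    """True if every 2-coloring of K_n's edges has a monochromatic triangle."""
    edges = list(combinations(range(n), 2))
    return all(
        has_mono_triangle(n, dict(zip(edges, colors)))
        for colors in product((0, 1), repeat=len(edges))
    )

# K_5 admits a triangle-free 2-coloring; K_6 does not. Hence R(3) = 6.
print(forces_mono_triangle(5), forces_mono_triangle(6))  # False True
```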
The only other known value is \(R(4) = 18\).
\(R(5)\)
\(R(5)\) is currently unknown; the best current bounds for \(R(n)\) (the lower bound due to Spencer, the upper bound due to Conlon) are
\[ (1+o(1))\frac{\sqrt{2}n}{e}\sqrt{2}^n \leq R(n) \leq n^{-\frac{c \log n}{\log \log n}} 4^n. \]
Numerically, it is known that \(43 \leq R(5) \leq 49\). There
are \(\binom{43}{2} = 903\) edges in \(K_{43}\), and so \(2^{903}\) possible 2-colorings. If \(R(5) = 43\), somehow
we have to show that every one of those 2-colorings contains a monochromatic \(K_5\).
Evolutionary algorithms play an important role in these types of problems.
\(R(5)\)
"A fairly complex genetic algorithm was used. Since then I've found very simple methods that can produce the same construction, with the aid of much faster computers." — Geoff Exoo
This is Exoo describing how he was able to show that \(R(5) > 42\) in the first place, and is a situation that has repeated itself
many times. To show that \(R(5) > 42\), we just have to find a 2-coloring of \(K_{42}\) with no monochromatic \(K_5\).
Using an evolutionary algorithm, start with a random 2-coloring of \(K_{42}\), and search. The number of monochromatic
\(K_5\)s is the fitness of a coloring, and changing the color of one or more edges moves through the space.
We make random changes, and accept the ones that produce more fit colorings. If we find a coloring with fitness \(0\), we stop.
When a good coloring was found, the coloring itself was Exoo's proof that \(R(5) > 42\), leading to a very short paper.
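Exoo's actual program is not shown here; as a sketch, the fitness function for such a search just counts monochromatic cliques. The version below is illustrated on the pentagon coloring of \(K_5\) (outer cycle red, diagonals blue), the coloring witnessing \(R(3) > 5\):

```python
from itertools import combinations

def mono_clique_count(n, coloring, k):
    """Fitness: number of monochromatic K_k subgraphs in a 2-coloring of
    K_n's edges (coloring maps each pair (i, j), i < j, to 0 or 1)."""
    count = 0
    for verts in combinations(range(n), k):
        if len({coloring[e] for e in combinations(verts, 2)}) == 1:
            count += 1
    return count

# Pentagon coloring of K_5: cycle edges red (0), diagonals blue (1).
cycle = {(0, 1), (1, 2), (2, 3), (3, 4), (0, 4)}
coloring = {e: (0 if e in cycle else 1) for e in combinations(range(5), 2)}
print(mono_clique_count(5, coloring, 3))  # 0, so this coloring has fitness 0
```

For \(K_{42}\) and \(k = 5\) this brute-force count is expensive; a serious search would update the count incrementally as edges change color.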
Avoiding arithmetic progressions
An arithmetic progression (AP) of length 3 is a set of the form \(\{a,a+b,a+2b\}, b \neq 0\), with generalizations to longer APs.
Theorem (Van der Waerden, 1927): For any \(k\), there exists a least integer \(w(k)\) such that if the interval \([w(k)] = \{1,2,\ldots,w(k)\}\) is 2-colored, it contains a
monochromatic \(k\)-term AP.
Today we know that
\[ w(k) \leq 2^{2^{2^{2^{2^{k+9}}}}},\]
due to Gowers in 2001; in fact, \(w(3) = 9\).
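The value \(w(3) = 9\) is small enough to verify by exhaustive search over all \(2^n\) colorings; a sketch:

```python
from itertools import product

def count_mono_3aps(chi):
    """Number of monochromatic 3-term APs in a 2-coloring chi of [n]
    (chi[i] is the color of the integer i + 1)."""
    n = len(chi)
    return sum(
        1
        for d in range(1, (n - 1) // 2 + 1)
        for i in range(n - 2 * d)
        if chi[i] == chi[i + d] == chi[i + 2 * d]
    )

def forces_mono_3ap(n):
    """True if every 2-coloring of [n] contains a monochromatic 3-AP."""
    return all(count_mono_3aps(chi) > 0 for chi in product((0, 1), repeat=n))

# [8] has a coloring with no monochromatic 3-AP; [9] does not. So w(3) = 9.
print(forces_mono_3ap(8), forces_mono_3ap(9))  # False True
```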
Avoiding arithmetic progressions
A related question: for a given \(k\) and
a 2-coloring of \([n]\), how few monochromatic \(k\)-APs are possible?
Here we are interested in large \(n\), where there must be many monochromatic \(k\)-APs; the goal is to minimize their number.
For this problem, essentially all of the work has been aided by computer. Naturally, this question was first asked for the case \(k=3\).
This question is ideal to attack with evolutionary algorithms:
it is easy to measure the fitness of a coloring, by counting the number of monochromatic 3-APs.
it is easy to generate many nearby neighbors, by changing the color of one or more of the numbers.
Avoiding arithmetic progressions
The basic algorithm is simple:
Algorithm \(A\)
1. [Initialize.] Set \(\chi \leftarrow\) a random 2-coloring of \([n]\).
2. [How fit is \(\chi\)?] Set \(F_\chi \leftarrow\) the number of monochromatic 3-APs in \(\chi\).
3. [Generate a neighbor.] Set \(R \leftarrow\) a random number in \([n]\). Set \(\psi \leftarrow \chi\), with its
color in position \(R\) changed.
4. [How fit is the neighbor?] Set \(F_\psi \leftarrow\) the number of monochromatic 3-APs in \(\psi\).
5. [Accept if appropriate.] If \(F_\psi < F_\chi\), set \(\chi \leftarrow \psi\) and set \(F_\chi \leftarrow F_\psi\).
6. [Repeat.] Go to step 3.
Slightly more sophisticated variants of this algorithm use different notions of "appropriate" in step 5; it is often best to
occasionally accept the new coloring even when it is merely equally good, or even worse. The strict-improvement version above is
sometimes known as a "hill-climbing" algorithm, while variants that sometimes accept worse colorings resemble simulated annealing.
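A direct Python transcription of Algorithm \(A\) might look like the following (a fixed step budget is added here so the loop terminates; in practice one runs until a fitness-0 coloring appears or patience runs out):

```python
import random

def count_mono_3aps(chi):
    """The fitness F: number of monochromatic 3-term APs in chi."""
    n = len(chi)
    return sum(
        1
        for d in range(1, (n - 1) // 2 + 1)
        for i in range(n - 2 * d)
        if chi[i] == chi[i + d] == chi[i + 2 * d]
    )

def algorithm_a(n, steps=2000, seed=0):
    rng = random.Random(seed)
    chi = [rng.randrange(2) for _ in range(n)]   # step 1: random 2-coloring
    fit = count_mono_3aps(chi)                   # step 2: its fitness
    for _ in range(steps):                       # step 6: repeat
        r = rng.randrange(n)                     # step 3: flip one position...
        chi[r] ^= 1                              # ...to obtain the neighbor psi
        new_fit = count_mono_3aps(chi)           # step 4: neighbor's fitness
        if new_fit < fit:                        # step 5: accept if strictly better
            fit = new_fit
        else:
            chi[r] ^= 1                          # otherwise undo the flip
    return chi, fit

chi, fit = algorithm_a(60)
print(fit)  # far below the fitness of a typical random coloring
```

This sketch recomputes the fitness from scratch at every step; a serious implementation would compute only the change in the number of monochromatic 3-APs caused by flipping position \(R\).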
Outcome of the algorithm
Remarkably, a consistent pattern emerges when we run the above algorithm for any reasonably large \(n\); here, we will take
\(n = 800\), for presentation.
The final, stable coloring produced is one of
or, much more rarely,
Of course, red and blue may be interchanged. The second coloring has more monochromatic 3-APs than the first, and represents a local
minimum.
The tendency to become trapped in local minima is the reason for sometimes accepting worse colorings in Algorithm \(A\).
The algorithm in practice
For reference, the most common coloring in the case of 3-term APs is
Results and questions
For the 12-block coloring that avoids 3-term APs, the correct positions of the boundaries between blocks have been determined
by considering the continuous case.
It is known that among symmetrical block colorings with at most 24 blocks, the coloring above is best for 3-term APs.
For 4-APs, the best known coloring is to color a number red if its last nonzero digit in base 11 is 1, 3, 4, 5, or 9, and otherwise color it blue (Lu, Peng).
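This digit-based coloring is easy to implement and to compare against, say, a single-color baseline; a small sketch (the cutoff \(n = 500\) is an arbitrary choice for illustration):

```python
def lu_peng_color(m):
    """Red (0) if the last nonzero base-11 digit of m is 1, 3, 4, 5, or 9;
    blue (1) otherwise."""
    while m % 11 == 0:
        m //= 11
    return 0 if m % 11 in {1, 3, 4, 5, 9} else 1

def count_mono_4aps(chi):
    """Number of monochromatic 4-term APs in chi (chi[i] colors i + 1)."""
    n = len(chi)
    return sum(
        1
        for d in range(1, (n - 1) // 3 + 1)
        for i in range(n - 3 * d)
        if chi[i] == chi[i + d] == chi[i + 2 * d] == chi[i + 3 * d]
    )

chi = [lu_peng_color(m) for m in range(1, 501)]
print(count_mono_4aps(chi), count_mono_4aps([0] * 500))
```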
Questions:
Why do 3-term and 4-term APs give such different results? For block colorings, we have linear algebra and calculus;
for digit-based colorings like Lu and Peng's, we have number theory; none of these seem adequate to explain the difference.
Is either of the two best known colorings actually optimal?
Does either of these two behaviors come up again if we consider longer APs?
Is there a larger context in which these two behaviors make sense?
Constellations
There is an obvious generalization of APs, which we call constellations. Instead of considering sets of the form
\[ \{a, a+b, a+2b\}, \]
we would like to consider sets of the form
\[ \{a, a+kb, a+\ell b \}. \]
We denote this type by \([k,\ell-k]\); this gives the relative differences between the first and second elements,
and between the second and third. The notation generalizes to constellations with four or more terms.
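Counting monochromatic constellations of a given type is a one-line change to the 3-AP count. A sketch, where the type \([k,\ell-k]\) is passed as the pair \((k,\ell)\) (so the type \([1,2]\) below corresponds to \(k=1\), \(\ell=3\)):

```python
def count_mono_constellations(chi, k, l):
    """Number of monochromatic sets {a, a + k*b, a + l*b} with b >= 1 in chi,
    where chi[i] is the color of i + 1. The pair (1, 2), i.e. type [1,1],
    recovers ordinary 3-term APs."""
    n = len(chi)
    return sum(
        1
        for b in range(1, (n - 1) // l + 1)
        for a in range(n - l * b)
        if chi[a] == chi[a + k * b] == chi[a + l * b]
    )

# Sanity check on a single-color interval [10]: every 3-AP is monochromatic.
print(count_mono_constellations([0] * 10, 1, 2))  # 20
```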
Here are some colorings obtained with algorithm \(A\).
\([1,2]:\,\,\)
\([1,3]:\,\,\)
\([1,4]:\,\,\)
\([1,5]:\,\,\)
Some of the blocks can be attributed to noise, and to our small choice of \(n = 800\), but there is a clear pattern.
\([3,5]:\,\,\)
These striped patterns on the sides of solid blocks occur for many choices of \([k,\ell]\) where \(k > 1\), though
it is not entirely clear why.
Continuization
One further generalization of the constellation problem is to consider the continuization. For 3-term APs in \([n]\), let
\(S = \{(a,b) : a, b \geq 1 \mbox{ and } a+2b \leq n\}\) and let \(\chi : [n] \to \{0,1\}\). We have been minimizing
\[ \sum_{(a,b) \in S} \mathbb{1}_{\chi(a) = \chi(a+b) \mbox{ and } \chi(a) = \chi(a+2b)}. \]
We can also consider 2-colorings of the continuous interval \(I = [0,1]\) and attempt to minimize
\[ \int_I \int_0^{1-x} \mathbb{1}_{\chi(x)=\chi(x+y/2) \mbox{ and } \chi(x) = \chi(x+y)} \,dy\,dx. \]
This generalizes to constellations of \(n\) terms. Since we are not constrained to integers, this also allows us to consider
constellations such as \([1,\pi]\). There may be something to learn by observing the behavior of this integral for the entire
range between types \([1,1]\) and \([1,2]\), for instance.
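The continuous objective can be estimated by Monte Carlo sampling; a sketch (the sample count is an arbitrary choice):

```python
import random

def continuous_fitness(chi, samples=200_000, seed=0):
    """Monte Carlo estimate of the double integral for chi : [0,1] -> {0,1}.
    Sample (x, y) uniformly from the unit square and count the points inside
    the region y <= 1 - x where the triple {x, x + y/2, x + y} is
    monochromatic."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if y <= 1 - x and chi(x) == chi(x + y / 2) == chi(x + y):
            hits += 1
    return hits / samples

# For a single-color interval every triple is monochromatic, so the value is
# the area of the triangle {y <= 1 - x}, namely 1/2.
print(round(continuous_fitness(lambda t: 0), 2))  # 0.5
```

The same estimator handles any constellation type by replacing \(y/2\) with the appropriate fraction of \(y\), including irrational types such as \([1,\pi]\).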
Continuization continued
Another hope for the continuous viewpoint is that it may explain the striped patterns seen above in the constellation \([3,5]\);
in the continuous case, an "AP of length 3" is a set of the form
\[\left\{x, x+\frac{1}{2} y, x+y\right\},\]
while a constellation of type \([3,5]\) is a set of the form
\[\left\{x, x+\frac{3}{8} y, x+y\right\}.\]
In the continuous setting, these two seem far less fundamentally different than they do in the discrete case. Maybe the stripes have a natural explanation here,
or else they were an artifact of the discrete problem for finite \(n\).
Conclusion
There are many problems in combinatorics that we do not have the tools to attack directly. Evolutionary algorithms can give us
a first view of the landscape when we're confronted with a new problem.
It is unlikely that we could have discovered the block coloring for 3-term APs without these algorithms, and we would not know
that \(R(5) > 42\); in fact, many values for a more general version of \(R(\cdot)\) are known exactly, due to algorithmic optimization.
We have seen how easy it is to phrase some combinatorial problems in a way that makes using an evolutionary algorithm the obvious
approach. More surprisingly, this can also lead to insights that allow classical mathematics back into the research loop. We can gain
more from evolutionary algorithms than just the outcomes of their computations.