I introduced algorithmic information theory (AIT) in Part 23. The fractal figure below needs ~1.6 million bits for specification or storage. But it can be *generated* by a very small program (requiring only a tiny fraction of this number of bits), using the definition of the Mandelbrot set and a few other specifications. We say that the figure has only a small algorithmic information content, even though it *looks* very complex.
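As an illustration (a sketch, not the author's figure: the grid size, escape threshold, and iteration cap below are arbitrary choices), a few lines of Python regenerate a Mandelbrot image at any resolution, so the generating program is vastly smaller than the bitmap it produces:

```python
# A tiny program regenerates a Mandelbrot image of any size:
# the figure's algorithmic information content is small.
def mandelbrot(width=64, height=32, max_iter=50):
    rows = []
    for j in range(height):
        row = ""
        for i in range(width):
            # Map the pixel (i, j) to a point c in the complex plane.
            c = complex(-2.0 + 3.0 * i / width, -1.0 + 2.0 * j / height)
            z = 0j
            for _ in range(max_iter):
                z = z * z + c            # the defining iteration
                if abs(z) > 2.0:         # escaped: c is outside the set
                    row += " "
                    break
            else:
                row += "#"               # never escaped: (probably) inside
        rows.append(row)
    return "\n".join(rows)

print(mandelbrot())
```

The whole source is a few hundred bytes, yet by raising `width`, `height`, and `max_iter` it specifies arbitrarily detailed (megabit-scale) pictures.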
Computation
involves three things:

1. The length of the computer program.

2. The time it takes to do the computation.

3. The computer memory needed for the job.

AIT largely ignores the second and third aspects, and recognizes only the first when defining the information content of a given set of numbers or data. In other words, it focuses on *program-size complexity*.
Chaitin, who along with Kolmogorov founded the subject of AIT, makes the point that a theory can be likened to a computer program. The program calculates and explains a certain set of observations, and the smaller this program is (in terms of compression of information), the better the theory (recall Ockham's razor).

When a set of observations or data cannot be described compactly in terms of axioms and/or theorems, there is no structure, order, or pattern in the data. Such a set of data is *logically random*. Something is said to be random if the smallest program that calculates or generates it is as large as the thing itself, so that no compression is possible.
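A rough way to see this on real data is to use an off-the-shelf compressor as a crude stand-in for the (uncomputable) shortest program; the sketch below, with arbitrarily chosen sizes, is only illustrative:

```python
import os
import zlib

# A highly structured string: one short pattern repeated many times.
structured = b"0110" * 4096            # 16 KiB of pattern
# A random string of the same length.
random_bytes = os.urandom(len(structured))

# zlib is only a crude proxy for the shortest generating program,
# but it makes the contrast visible: structure compresses, randomness does not.
print(len(zlib.compress(structured, 9)))    # far smaller than 16 KiB
print(len(zlib.compress(random_bytes, 9)))  # roughly as long as the data itself
```

The structured input shrinks to a tiny fraction of its size, while the random input stays essentially incompressible.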
Chaitin has shown that certain facts are not just
computationally irreducible or incompressible; they are logically irreducible
or incompressible as well. The proof of their 'truth' must be in the form of
additional axioms, without any reasoning. So there are severe limits to the
powers of logic and reason.

Chaitin introduced a number omega (Ω) to quantify the degree of logical randomness of any system, and to show that the powers of reasoning are limited. He demonstrated the existence of *an infinite stream of unprovable mathematical facts*.
Let the term
'program' imply 'the concatenation of the computer program and the data to be
read in by the program'. Consider an ensemble of all such possible programs.
What is the probability that a program chosen at random from this set will ever
halt (cf. Part 66)?

*The number Ω is that probability.*
How do we choose a program at random for testing this? A program is simply a succession of bits (0s and 1s). Since we are considering all possible programs, any succession of bits is a possible program whose halting behaviour can be tested. We can flip a coin repeatedly to get a random sequence of bits, adding random bits one at a time until the sequence of bits constitutes a complete program. The number Ω is the probability that halting will indeed occur for such a sequence of randomly generated bits.
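This coin-flipping procedure cannot literally be run against a real universal computer, since halting is undecidable, but it can be mimicked with an invented toy machine whose rules we choose ourselves. Everything below is a made-up assumption for illustration: the prefix rule "a program ends at its first 0 bit" and the halting rule "an even number of leading 1s halts", which give an exact toy Ω of 1/2 + 1/8 + 1/32 + … = 2/3.

```python
import random

def toy_halts(bits):
    # Invented toy rule: the program is the bits up to and including
    # the first 0; it "halts" iff the number of leading 1s is even.
    return (len(bits) - 1) % 2 == 0

def sample_program(rng):
    # Flip a fair coin, one bit at a time, until the sequence is a
    # complete program under the toy prefix rule (first 0 ends it).
    bits = []
    while True:
        bits.append(rng.randint(0, 1))
        if bits[-1] == 0:
            return bits

def estimate_omega(trials=100_000, seed=1):
    # Monte Carlo estimate of the toy halting probability.
    rng = random.Random(seed)
    halted = sum(toy_halts(sample_program(rng)) for _ in range(trials))
    return halted / trials

print(estimate_omega())   # close to the exact toy value 2/3
```

The estimate converges on 2/3, the toy machine's halting probability; for a real universal machine no such direct simulation is possible.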

These operations, of course, assume the presence of a computing machine for doing the testing, and the use of a programming language. But it turns out that the crucial conclusions about halting or otherwise do not depend on these choices: the actual value of Ω depends on them, but not the general conclusions drawn. Our arguments can therefore proceed by assuming a particular computer and a particular programming language.

Since the number Ω is a probability, it lies between 0 and 1. In binary notation it may look something like 0.110010101… The central point made by Chaitin is that the bits after the (binary) point form an *irreducible* stream. Every 0 or 1 after the point represents a fact, and the totality of these bits represents irreducible mathematical facts.
The number Ω can be regarded as an infinite sum. Each *N*-bit program that halts contributes 1/2^{N} to this sum; each such program adds a 1 to the *N*th bit. One may think that a precise value of Ω can be computed by adding all the bits for the programs that halt. This is not so. Although Ω is a perfectly well-defined specific number, *it is impossible to compute it in its entirety* (see below). It is possible to compute only a few digits of Ω. For example, if we know that computer programs 0, 10, and 110 all halt, then Ω = 0.111 up to three digits. But the first *N* digits of Ω cannot be calculated by a program of length significantly shorter than *N*. Knowing the first *N* digits of Ω would enable us to tell whether or not each program up to *N* bits in size ever halts. This means that at least an *N*-bit program is needed to calculate *N* bits of Ω.
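The three-digit example above can be checked directly. The helper names below are ours, and the list of halting programs is taken from the text's example rather than from any real halting test:

```python
from fractions import Fraction

def omega_lower_bound(halting_programs):
    # Each halting N-bit program contributes 1/2^N to Omega.
    return sum(Fraction(1, 2 ** len(p)) for p in halting_programs)

def to_binary(x, digits):
    # Binary expansion of a fraction in [0, 1), truncated to `digits` bits.
    out = "0."
    for _ in range(digits):
        x *= 2
        bit = int(x)          # next binary digit, 0 or 1
        out += str(bit)
        x -= bit
    return out

# The text's example: programs 0, 10, and 110 are known to halt.
bound = omega_lower_bound(["0", "10", "110"])
print(bound)                  # 7/8
print(to_binary(bound, 3))    # 0.111
```

The contributions 1/2 + 1/4 + 1/8 = 7/8 indeed give Ω = 0.111 to three binary digits, exactly as stated; the uncomputability enters only in *knowing* which programs halt.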
Chaitin’s Ω cannot be computed to arbitrarily high precision because if we knew Ω exactly, we could solve Turing’s halting problem, which is known to be unsolvable.

Given any finite
program, no matter how long, we have an infinite number of bits that the
program cannot compute. This implies that, given any finite set of axioms,
there are an infinite number of truths that are unprovable in that system. Ω is
irreducible.

Thus a theory
of everything for all of mathematics cannot exist. The number Ω has an infinite
number of bits or mathematical facts that cannot be derived from any principles
simpler than the string of bits itself.

Gödel’s work had shown that individual formal axiomatic
mathematical theories cannot prove the true numerical statement 'This statement
is unprovable'. According to Chaitin (2006), Gödel left unanswered the key
question: 'Is this an isolated phenomenon, or are there many important
mathematical truths that are unprovable?' It now turns out that the number Ω
provides an infinite number of true theorems that cannot be proved by any
finite system of axioms.

Leibniz had stipulated that if something (a theorem)
is true in mathematics, it is true for a reason, the reason being the proof of
the theorem. But the bits of Ω are totally random, and therefore these
mathematical truths are truths for no reason.