Saturday, March 31, 2012

21. The Nature of Information


How did life emerge from non-life? It did so through a long succession of processes and events in which more complex structures evolved from simpler ones. Beginning with this post, I shall take you on a fascinating and easy-to-follow journey, explaining how complexity science answers such questions. This and the next few posts will introduce some of the jargon and basic concepts of modern complexity science.


The second law of thermodynamics for open systems is the primary organizing principle for all natural phenomena (cf. Part 6). The relentless expansion and cooling of our universe has been creating gradients of various types, which tend to get annulled as the blind forces of Nature take the local systems towards old or new equilibria. New patterns and structures get created when new equilibrium structures arise.

Take any living entity; say the human body, or even a single-celled organism. The amount of information needed for describing the structure of a single biological cell is far more than the information needed to describe, say, an atom or a molecule. The technical term one introduces here is 'complexity'. We say that the biological cell has a much higher DEGREE OF COMPLEXITY than an atom or a molecule.

Let us tentatively define the degree of complexity of any object or system as the amount of information needed for describing its structure and function.

Since the degree of complexity has been defined in terms of 'amount of information', we should be clear about the formal meaning of 'information'. The word 'bit' is commonplace in this IT age. It is the short form for 'binary digit', and first appeared in print in Claude Shannon's 1948 paper, where Shannon credited the coinage to John W. Tukey. A bit has two states: 0 or 1. Shannon took the bit as the unit of information.


One bit is the quantity of information needed (it is the 'missing' or 'not-yet-available' information) for deciding between two equally likely possibilities (for example, whether the toss of a coin will be 'heads' or 'tails'). And the information content of a system is the minimum number of bits needed for a description of the system.

The term ‘missing information’ is assigned a numerical measure by defining it as the uncertainty in the outcome of an experiment yet to be carried out. The uncertainty is high when any one of a large number (Ns) of outcomes can occur or, what amounts to the same thing, when the probability of a particular outcome is inherently low.

Suppose we have a special coin with heads on both sides. What is the probability that the result of a toss of the coin will be 'heads'? The answer is 100% or 1; i.e., certainty. Thus carrying out this experiment gives us zero information. We were certain of the outcome, and we got that outcome; there was no missing information.

Next, we repeat the experiment with a normal, unbiased, two-sided coin. There are two possible outcomes now (Ns = 2). In this case the actual outcome gives us information which we did not have before the experiment was carried out (i.e., we get the missing information).

Suppose we toss two coins instead of one. Now there are four possible outcomes (Ns = 4). Therefore, any particular experiment here gives us even more information than in the two situations above. Thus: Low probability means high missing information, and vice versa.
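
The three coin experiments above can be enumerated directly. The following is a minimal Python sketch, not part of the original argument: the helper names `outcomes` and `outcome_probability` are hypothetical, and the code assumes that all distinct outcomes are equally likely. It only counts outcomes (Ns) and their probabilities; no measure of information is assigned yet.

```python
from fractions import Fraction
from itertools import product


def outcomes(*coins):
    """Return the set of distinct joint outcomes for the given coins.

    Each coin is a tuple of its two faces, e.g. ("H", "T").
    """
    return set(product(*coins))


def outcome_probability(*coins):
    """Probability of any one outcome, assuming all distinct outcomes are equally likely."""
    return Fraction(1, len(outcomes(*coins)))


two_headed = ("H", "H")  # the special coin with heads on both sides
fair = ("H", "T")        # a normal, unbiased coin

experiments = [
    ("two-headed coin", [two_headed]),
    ("one fair coin", [fair]),
    ("two fair coins", [fair, fair]),
]

for label, coins in experiments:
    ns = len(outcomes(*coins))
    p = outcome_probability(*coins)
    print(f"{label}: Ns = {ns}, probability of each outcome = {p}")
```

As Ns grows from 1 to 2 to 4, the probability of each particular outcome falls from 1 to 1/2 to 1/4, which is the trend summarized in the last sentence above.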

To assign a numerical measure to information, we would like the following criteria to be met:

1. Since (missing) information (I) depends on Ns, the definition of information should be such that, if we are dealing with a combination of two or more systems, Ns for the composite system should be correctly accounted for in the definition of information. For example, for the case of two dice tossed together or successively, Ns should be 6 x 6 = 36, and not 6 + 6 = 12.

2. Information I for a composite or 'multivariate' system should be a sum (and not, say, a multiplication) of the information for the components comprising the system.

The following relationship meets these two requirements:

Ns ~ base^I.

Let us see how. Suppose system X has Nx states and the outcome of an experiment gives information Ix. Let Ny and Iy be the corresponding quantities for system Y. For the composite system comprising X and Y, we get NxNy ~ base^Ix base^Iy. Since Ns = NxNy, we can write

Ns ~ base^(Ix + Iy).

Taking logarithms of both sides, and writing Ix + Iy = I, we get

log Ns ~ I log(base)

or

I ~ log Ns / log(base).

What kind of logarithm we take (base = 10, 2, or e), and what proportionality constant we select, is a matter of context. All such choices differ only by a scale factor, i.e. by the choice of units. The important thing is that this approach to the definition of information has given us a correct accounting of the number of states, 'bins', or classes (i.e. by a multiplication (NxNy) of the numbers of states of the individual systems), and a correct accounting of the individual measures of information (i.e. by addition).
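
The two requirements can be checked numerically for the two-dice example. This is a small Python sketch under stated assumptions: the function name `information` is hypothetical, the proportionality constant is taken to be 1, and all states are assumed equally likely.

```python
import math


def information(n_states, base=2):
    """Missing information I = log(Ns) / log(base) for Ns equally likely states.

    The proportionality constant is assumed to be 1.
    """
    return math.log(n_states) / math.log(base)


n_x, n_y = 6, 6                    # two dice, six faces each
i_x = information(n_x)             # information for die X
i_y = information(n_y)             # information for die Y
i_joint = information(n_x * n_y)   # composite system: Ns = Nx * Ny = 36

# States multiply, information adds.
print(f"Ix + Iy = {i_x + i_y:.4f}, I(joint) = {i_joint:.4f}")
assert math.isclose(i_x + i_y, i_joint)

# Changing the base only rescales I by a constant factor (a change of units).
ratio = information(36, base=2) / information(36, base=10)
assert math.isclose(ratio, math.log(10) / math.log(2))
```

Using 6 + 6 = 12 states for the composite system would break the first check, since log 12 is not equal to log 6 + log 6.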

All the cases considered above are equiprobability cases: When a die is thrown, the probability P1 that the face with ‘1’ will show up is 1/6, as are the probabilities P2, P3, ..., P6 that '2', '3', ..., '6' will show up. For such examples, the constant probability P is simply the reciprocal of the number of possible outcomes, classes, or bins; i.e., Ns:

P = 1 / Ns.

Substituting this in the above relation, we get

I ~ log (1/P) / log (base).

Introducing a suitable proportionality constant c, we can write

I = c log (1/P).
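
As an illustration, the formula can be evaluated for the examples used in this post. The sketch below is in Python; the function name `missing_information` is hypothetical, and the default choice c = 1 / ln 2 is an assumption made here so that the result comes out in bits.

```python
import math


def missing_information(p, c=1 / math.log(2)):
    """Evaluate I = c * log(1/P) for an outcome of probability P.

    With the default c = 1 / ln 2 (an assumed choice), I is measured in bits.
    """
    if not 0 < p <= 1:
        raise ValueError("P must lie in the interval (0, 1]")
    return c * math.log(1 / p)


examples = [
    ("two-headed coin", 1.0),   # certain outcome
    ("one fair coin", 1 / 2),
    ("two fair coins", 1 / 4),
    ("one fair die", 1 / 6),
]

for label, p in examples:
    print(f"{label}: P = {p:.4f}, I = {missing_information(p):.3f} bits")
```

The two-headed coin gives I = 0 (no missing information), one fair coin gives 1 bit, two fair coins give 2 bits, and a fair die gives about 2.585 bits.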

This is close to the SHANNON FORMULA for missing information. Even more pertinently, this equation is similar to the famous Boltzmann equation for entropy:

S = k log W.

Entropy has the same meaning as missing information, or uncertainty. More on this next time.