Thursday, November 14, 2013

106. Scale-Free Networks

'This unparalleled license of expression, coupled with diminished publishing costs, makes the Web the ultimate forum of democracy; everybody’s voice can be heard with equal opportunity. Or so insist constitutional lawyers and glossy business magazines. If the web were a random network, they would be right. But it is not. The most intriguing result of our Web-mapping project was the complete absence of democracy, fairness, and egalitarian values on the Web. We learned that the topology of the Web prevents us from seeing anything but a mere handful of the billion documents out there' (Barabási, Linked).

A scale-free network is one in which a few of the nodes have a much larger number of connections than others. For regular networks and for random networks the number of edges per node has an approximately Gaussian distribution, with a mean value that is a measure of the scale of the network. In a scale-free network, by contrast, there are a few strongly connected nodes and a large number of weakly connected ones. Typically, the degree distribution follows a power law:

P(k) ~ k

The exponent γ generally lies between 2 and 3. Most of the complex networks in Nature have a power-law degree distribution (cf. Part 83).

Since the distribution function P(k) does not show a characteristic peak (unlike the Gaussian peak for random networks and regular networks), the network is described as scale-free. For it, the average distance between any two nodes is almost as small as for a random network, and yet its clustering coefficient is practically as large as for a regular network. In other words, scale-free networks exhibit cliquishness, like small-world networks (cf. Part 105).

Power-law degree-distribution models for random networks

Often one can bring a random-network model closer to reality by introducing a power-law degree distribution into it. Such a topologically modified network is still random in the sense that the edges still connect randomly selected nodes. But a constraint is introduced that the degree distribution function P(k) must follow a power law (Albert and Barabási 2002).

Real-life networks evolve with time. It has been observed in a wide variety of complex networked systems that their topology evolves to achieve higher and higher degrees of robustness (see below) against attacks and failures (Albert and Barabási 2002; Barabási 2003). How does this happen? How does the power-law degree distribution arise? What mechanisms result in the coexistence of large clustering coefficients and small path lengths?

We have seen in Part 105 that the Watts-Strogatz model captures some of the important features of many real networks, even though it does not reproduce the power-law or scale-free feature characteristic of many real networks. The generalized random-graph model, on the other hand, is able to introduce power-law degree-distribution behaviour.

The ad hoc ways in which these models are introduced do not shed light on the underlying mechanisms operative in the evolution of the topology of real networks. What is needed is that, instead of modelling the network topology, we should try to model the assembly of the network and its evolution. This was first done by Barabási and Albert (1999).

The Barabási-Albert model

The BA model captures some essential features of network dynamics, and can reproduce the scale-free connectivity exhibited by many real networks, by doing away with the two assumptions made in the models we have discussed so far. We have assumed that the order N of a network is constant. We have also assumed that the connection probability p does not depend on which two nodes are selected for connection. What usually happens in the real world, however, is that, firstly, networks grow with time by the addition of new nodes (e.g. the World Wide Web grows exponentially with time by the continuous addition of new web pages); and secondly, there is likelihood of preferential attachment to certain nodes (e.g. when a new manuscript is prepared for publication, there is a greater chance that well-known papers will be cited, rather than less frequently cited average-quality papers).

The growth feature in the BA model is introduced as follows. One starts with a small number m0 of nodes, and at every time step a new node having m (≤m0) edges is added. Thus there are N = m0 + t nodes after t time steps.

And the preferential attachment of new nodes is carried out as follows: It is assumed that the probability Π that the new node will be connected to an existing node i is linearly related to the degree ki of the node i:

Π(ki) = ki / ∑i ki

Thus there are mt edges in the network after t time steps.

Numerical simulations demonstrate that the BA network evolves to be a scale-free network, with the probability that a node has k edges given by a power law with exponent γBA = 3.

The properties of real networks can differ substantially from what is predicted by the BA model. For example, the power-law exponent for the actual degree distribution may lie anywhere between 1 and 3. Even non-power-law features like an exponential cutoff, or saturation for small k, are observed sometimes. Evolving networks in Nature can develop both power-law and exponential degree distributions. Certain preferential attachments, aging effects, and growth constraints can make a network cross over from power-law to exponential-degree distribution. Various refinements of the BA model have been investigated (cf. Albert and Barabási 2002).

Robustness of networks

Networked complex adaptive systems display a remarkable degree of error tolerance. Naturally, this robustness aspect of networks has been investigated widely by computer simulation, as well as by analytical studies. The error tolerance of real networks has a dynamic aspect (Garlaschelli, Capocci and Caldarelli 2007) and a topological aspect. The topological aspect is easier to simulate on a computer. One may focus either on edges or on vertices, or on both. A network is defined as robust if it contains a giant cluster consisting of most of the nodes even after a fraction of its nodes are removed.

The progressive removal of edges randomly is like an inverse edge-percolation problem (cf. Part 104). The removal of a node, on the other hand, has a more damaging effect on the network topology, since all the edges adjacent with that node also get removed.

Simulation studies have shown a strong correlation between robustness and topology. Scale-free networks are found to be generally more robust than random networks. However, if the most connected nodes of a scale-free network are the targets of attack, then they are much more vulnerable than random networks (Albert, Jeong and Barabási 2000).