Untangling Complex Systems (Pier Luigi Gentili) » p.82 » Global Archive Voiced Books Online Free

Untangling Complex Systems, page 82

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112

6 The satisfiability (SAT) problem consists of a logical propositional formula with N variables, and it requires to find a value (true or false) for each variable that makes the formula true. For example, the formula “x AND NOT y” is satisfiable because the values x = TRUE and y = FALSE makes “x AND NOT y” = TRUE. In contrast, “y AND NOT y” is not

satisfiable. When the formula consists of a conjunction of clauses, and each clause is a disjunction of k variables, any of which can be negated, the problems is called k-SAT. For k > 2, the k-SAT is NP-complete.

Complex Systems

419

NP problems

NP-complete

problems

P problems

If NP = P

P problems

FIGURE 12.2 The three main classes of computational problems (on the left) that would become just one if

it was demonstrated that NP = P.

Aware of the relevance of the Computational Complexity theory, in 2000, the Clay Mathematics

Institute in Cambridge, Massachusetts, has named “P versus NP” as one of its “Millennium” prob-

lems, and it offers one million dollars to anyone who provides verified proof that either NP = P or

NP ≠ P . If the relation NP = P were demonstrated to be true, then, as the same Gödel said in a letter to Von Neumann, in 1956, that discovery would have “consequences of the greatest magnitude” (Sipser

1992). Our life would not be the same (Fortnow 2009). Everything would be much more efficient. The

transportation schedules of all forms of transit would be optimized, allowing people and goods to move

quicker and cheaper. Manufacturers and businessmen would improve their production processes and

increase profits. It would become much easier to find new effective treatments for incurable diseases,

make weather forecasts, predict catastrophic events, and the trends of stock markets. Humanity would

have new powerful algorithms and tools to tackle all the Complexity Challenges.

The discovery that NP = P would bring benefit (Fortnow 2009) to another challenge we encoun-

ter in dealing with Natural Complex Systems. It is the challenge of formulating universally valid

and effective algorithms for recognizing variable patterns. Variable patterns are objects (both ani-

mate and inanimate) or events whose recognition is made difficult by their multiple features (since

they contain a significant amount of information), variability, and extreme sensitivity on the context.

Examples are human faces, voices, and fingerprints (biometrics), handwritten cursive words and

numbers, patterns in medical diagnosis, patterns in apparently uncorrelated scientific data (data

mining), and chaotic and stochastic time series. Human beings are good at recognizing some vari-

able patterns, such as faces and voices. However, we need to formulate algorithms for identifying

every type of pattern, whatever is their context. The steps of pattern recognition are: (I) data acqui-

sition by instruments; (II) selection of the features (any pattern is represented by a k-dimensional

vector if k is the number of selected features); (III) application of an algorithm for the classification

step. In the literature (Bishop 2006), there are different examples of algorithms that have been pro-

posed, such as statistical, structural, or based on artificial neural networks (Jain et al. 2000). All of

them still suffer in universality and effectiveness.

12.3 IF IT WERE NP = P, WOULD BE THE COMPLEXITY CHALLENGES

SURELY WON?

In the previous paragraph, we knew the computational hurdles science encounters whenever it tries

to describe Complex Systems. Even if, one day, someone demonstrated that NP = P, one relevant

limit in the scientific knowledge would remain. It is the limit in determining the initial conditions

for every Complex System. The famous Uncertainty Principle formulated, at first, by Heisenberg

(1927), and then extended by others (Erhart et al. 2012; Furuta 2012), declares the impossibility

of determining, accurately and simultaneously, position and momentum of a microscopic particle.

The Uncertainty Principle places limits to the deterministic dream of describing the dynamics

of the entire Universe starting from the description of its tiny particles. We might get satisfied by

describing Complex Systems from the macroscopic point of view, neglecting their ultimate atomic

420

Untangling Complex Systems

constituents. However, the intrinsic unpredictability of chaotic dynamics, in the long term and at

both the microscopic and macroscopic level, would remain valid. In fact, any chaotic dynamics is

extremely sensitive to the initial conditions. And we are fully aware that the initial conditions of both

microscopic and macroscopic systems cannot be determined with an infinite degree of accuracy

because any experimental measurement is affected by unavoidable uncertainties (see Appendix D).

I think that we are now completely aware of the sharp limits we encounter in describing Natural

Complex Systems by using the fundamental physical laws. Therefore, we need to develop models to

interpret Natural Complexity. The first requirement for formulating a scientific model is to know, in

depth, the features of the systems we want to describe. Therefore, in the next pages of this chapter,

I present the characteristics of Complex Systems.

12.4 THE FEATURES OF COMPLEX SYSTEMS

The Universe is the most extensive Complex System. Within our Universe, all the galaxies are

Complex Systems. Within each galaxy, all the stars and planets are Complex Systems, as well. If we

assume that life is present only on the Earth (so far, we do not have proofs in contrast to this hypoth-

esis), then, we can plainly state that the Earth is the most complex planet in the Universe. What

makes our planet complex is its climate, its geology, and, particularly, its biosphere. The biosphere

is the global ecological system that consists of all the living beings, their relationships, and their

interactions with the geosphere, hydrosphere, and atmosphere. Every ecosystem included in the

biosphere is complex. Even every living being, either pluricellular or even unicellular, is complex.

Among the living creatures, we, humans, are the most complex, mainly due to our brains, which

confer us the power of computing with numbers and words. Of course, even human societies and the

economy are other examples of Complex Systems.

Which are the common features of so diverse Complex Systems?

12.4.1 neTworks

A Complex System consists of many elements, often diverse, if not unique. For example, the compo-

nents of an ecosystem are plainly diverse, and those belonging to a human community are evidently

unique. The elements of a Complex System are also highly interconnected. Complex Systems are

networks. They are intertwined systems. In fact, the etymology of the adjective “Complex” derives

from the Latin verb cum-plectere that means “to intertwine together.” It is different from the ety-

mology of “complicated.” “Complicated” derives from the Latin verb cum-plicare that means “to

fold together.” Anything that is “complicated,” is “folded” and can be “unfolded” (see Figure 12.3).

On the other hand, anything that is “complex,” is interwoven and cannot be “unfolded.” Instead, it

needs to be untangled.

A network is described as a collection of nodes (also called vertices) connected by links (or

edges) (Newman 2010). The nodes represent the elements of the network, and the links are the con-

nections between them (see Figure 12.4). For instance, if we consider our brain, a node is a neuron, whereas a link is either a connection between the synapse of a neuron and the dendrite of another

or a gap junction.7 In an ecosystem, the nodes could be the single species, and the links could be the ecological relationships between them. In economy, the nodes are the different individual agents

that can be firms, banks, or countries, and links are their mutual interactions, be it trade, ownership,

credit-debt relationships, or research and development alliances.

In undirected networks (see Figure 12.4a), the links between nodes do not have an assigned

direction. In directed networks, the interplay between any two nodes has a well-defined direction,

indicated by an arrow (see Figure 12.4b).

7 A gap junction is an aggregate of intercellular channels that permits transfer of ions and small molecules.

Complex Systems

421

FIGURE 12.3 Examples of “complicated” or “folded” objects, on the left, and a “complex” or “intertwined”

basket, on the right.

Node

Link

(a)

(b)

FIGURE 12.4 Simple undirected (a) and directed (b) networks.

An essential property of a node within a network is its connectivity or degree, d, which is the

number of links the node has to other nodes. For example, in (a), the grey node has degree d = 4. In

a directed network, we can define an incoming degree, din, which denotes the number of links that

point to a node ( din = 3 for the grey node in network b), and an outgoing degree, dout, which indicates the number of links that starts from it ( dout = 1 for the grey node). It is useful to define the degree

distribution, P( d), of a network. P( d) is obtained by counting the number of nodes with d = 1,2 …

, , m

and dividing each value by the total number of nodes, N. The degree distribution of the small net-

work, depicted in Figure 12.4a, is reported in Figure 12.5. An undirected network with L links is characterized by an average degree d = 2 L/ N . High-degree nodes are called hubs.

Another relevant feature of a network is the distance between nodes, which is measured by

counting the number of links we need to pass through to go from one node to another. Usually, there

are many alternative paths between two nodes. Among them, the most interesting is the shortest

path, sp, which is the path with the smallest number of links between the selected nodes. For exam-

ple, the shortest path between the white and grey nodes of the network represented in Figure 12.4a, is 2. In the directed network Figure 12.4b, the shortest path from the white node to the grey node, sp , is 2 and is different from the shortest path from the grey to the white, sp , which is equal to 3.

WG

GW

The overall navigability of a network is described by specifying its mean path length, sp, which is

the average over the shortest paths between all pairs of nodes.

422

Untangling Complex Systems

0.6

0.5

y 0.4

0.3

equencFr 0.2

0.1

0.0 0

1

2

3

4

Degree

FIGURE 12.5 Degree distribution of the small network depicted in Figure 12.4(a).

A final attribute of a network is its clustering. The clustering coefficient C of a node represents

the fraction of pairs of neighbors that are connected to one another. If a node has d neighbors, there

are d ( d − )

1 / 2 pairs of neighbors. C is the fraction of the d ( d − )

1 / 2 pairs that are linked. For

example, for the grey node of the network in Figure 12.4a, C = 2 6, whereas for the white node, C = 0. The average clustering coefficient, C, of a network quantifies the tendency of the nodes of

that network to form clusters or groups.

TRY EXERCISE 12.2

After an analysis of the structures and properties of many real networks, scientists have proposed six

main models. The first is the model of regular networks (or the so-called regular lattice structures).

An example is shown in Figure 12.6, and it is labeled as a. In a, we have 18 nodes. Each node is linked to its four nearest neighbors. The average clustering coefficient is 0.5, and the average path length is

2.5. In general, in a regular network, all the nodes have the same degree and P( d), plotted as a histo-gram, is just a single vertical segment. The mean path length, sp, is long and the clustering is high.

P( d)

log( P( d))

(a)

d

(d)

log( d)

P( d)

P( d)

(b)

d

(e)

d

P( d)

log( P( d))

d

log( d)

(c)

(f)

FIGURE 12.6 Six models of networks and their characteristic degree distributions. (a–f) represent regular,

small-world, random, scale-free, modular, and hierarchical networks, respectively.

Complex Systems

423

If we substitute a few of the nearest-neighbor links of a into long-distance links, chosen randomly,

we obtain the second model, a small-world network, like that labeled as b in Figure 12.6. In a small-world network (Watts and Strogatz 1998) sp, becomes short because the long-range edges are “short-

cuts” connecting vertices that would otherwise be much farther apart. Nonetheless, the nodes of the

network maintain high clustering coefficients and P( d) is still a discrete distribution. When the links of a regular network are completely randomized, we obtain the third model, a random network like

that labeled as c in Figure 12.6. In a random network, the degree distribution is a Poisson function (Erdös and Rényi 1960) peaked at d and decaying exponentially for d d . This trend implies that most of the nodes have approximately the same number of links, close to the average d. Since the tail

of P(d) decreases exponentially, nodes with degree larger than d are extremely rare. The mean path

length is short, like in a small-world network, but the clustering coefficient is low and independent of

a node’s degree, so C( d) appears as a horizontal line when plotted as a function of d (Barabási and Oltvai 2004).

TRY EXERCISE 12.3

The fourth model is the scale-free network (see d in Figure 12.6) that is characterized by a power-law degree distribution. The probability that a node has degree d is P( d ) = d−γ, where γ is a positive constant, usually included between 2 and 3 (Barabási and Oltvai 2004). The power-law distribution

entails that most of the nodes have low degree, whereas there are just a few hubs. The scale-free

network has a fractal-like structure and the features that are typical of a small-world network, i.e.,

short mean path length and high average clustering coefficient.

The fifth model is the modular network (see e in Figure 12.6), where one can easily identify groups of nodes that are highly interconnected with each other (the so-called modules) but have only a few or

no links to nodes outside of the group to which they belong to. A modular network is characterized by

a discrete degree distribution because nodes within single modules have robust connectivity, whereas

those bridging modules have sparse connectivity. The average cluster coefficient is significant.

TRY EXERCISE 12.4

The sixth model is the hierarchical network with clusters of nodes that combine in an iterative man-

ner (Ravasz et al. 2002). If we look carefully at the graphical example labeled as f in Figure 12.6, we notice nodes that appear in the center of the network have the highest degree and the lowest

clustering coefficient, while the nodes at the periphery of the network have low degrees and the

most significant clustering coefficients. In between are nodes with a moderate degree and moderate

clustering. It results that both P( d) and C( d) are power-law functions.

TRY EXERCISE 12.5

Most real networks are difficult to deal with, for different reasons (Strogatz 2001). First, their

structures can be intricate tangles, and they may continuously evolve. Second, their connections can

have different strengths and effects, and they may change, as well. Third, and most often, nodes

are diverse, and their internal state can vary in time in complicated, non-linear, ways. An example

of an incredibly dynamic network is the immune system that protects our bodies from infections

and intruders (antigens) (read Box 12.1 in this chapter). Despite all these difficulties, the study of

many biological, technological and social networks has demonstrated the ubiquity of small-world

properties. Networks as diverse as the cell, the brain, the food web, the internet, the World Wide Web,8

8 The Internet is the network of computers and routers with the wires and cables, which physically connect them, as edges.

The World Wide Web is the network of web pages where the edges are the hyperlinks (URL’s) that point from one page

to another.

424

Untangling Complex Systems

collaboration of movie actors, collaboration of scientists, the connections of bank, international trade

relationships, the words in human language, and so on, all exhibit short mean path lengths and high

average clustering coefficients with respect to hypothetical random networks having the same number

of nodes (Albert and Barabási 2002; Schweitzer et al. 2009). Small-world features guarantee an

enhanced signal-propagation speed, computational power, spread of fads and infectious diseases, and

synchronizability (Watts and Strogatz 1998). Among the investigated real networks, most of them

exhibit a degree distribution that follows a power law. This evidence suggests that many Complex

Untangling Complex Systems, page 82

Other author's books: