Untangling Complex Systems, page 82
6 The satisfiability (SAT) problem consists of a logical propositional formula with N variables, and it requires to find a value (true or false) for each variable that makes the formula true. For example, the formula “x AND NOT y” is satisfiable because the values x = TRUE and y = FALSE makes “x AND NOT y” = TRUE. In contrast, “y AND NOT y” is not
satisfiable. When the formula consists of a conjunction of clauses, and each clause is a disjunction of k variables, any of which can be negated, the problems is called k-SAT. For k > 2, the k-SAT is NP-complete.
Complex Systems
419
NP problems
NP-complete
problems
P problems
If NP = P
P problems
FIGURE 12.2 The three main classes of computational problems (on the left) that would become just one if
it was demonstrated that NP = P.
Aware of the relevance of the Computational Complexity theory, in 2000, the Clay Mathematics
Institute in Cambridge, Massachusetts, has named “P versus NP” as one of its “Millennium” prob-
lems, and it offers one million dollars to anyone who provides verified proof that either NP = P or
NP ≠ P . If the relation NP = P were demonstrated to be true, then, as the same Gödel said in a letter to Von Neumann, in 1956, that discovery would have “consequences of the greatest magnitude” (Sipser
1992). Our life would not be the same (Fortnow 2009). Everything would be much more efficient. The
transportation schedules of all forms of transit would be optimized, allowing people and goods to move
quicker and cheaper. Manufacturers and businessmen would improve their production processes and
increase profits. It would become much easier to find new effective treatments for incurable diseases,
make weather forecasts, predict catastrophic events, and the trends of stock markets. Humanity would
have new powerful algorithms and tools to tackle all the Complexity Challenges.
The discovery that NP = P would bring benefit (Fortnow 2009) to another challenge we encoun-
ter in dealing with Natural Complex Systems. It is the challenge of formulating universally valid
and effective algorithms for recognizing variable patterns. Variable patterns are objects (both ani-
mate and inanimate) or events whose recognition is made difficult by their multiple features (since
they contain a significant amount of information), variability, and extreme sensitivity on the context.
Examples are human faces, voices, and fingerprints (biometrics), handwritten cursive words and
numbers, patterns in medical diagnosis, patterns in apparently uncorrelated scientific data (data
mining), and chaotic and stochastic time series. Human beings are good at recognizing some vari-
able patterns, such as faces and voices. However, we need to formulate algorithms for identifying
every type of pattern, whatever is their context. The steps of pattern recognition are: (I) data acqui-
sition by instruments; (II) selection of the features (any pattern is represented by a k-dimensional
vector if k is the number of selected features); (III) application of an algorithm for the classification
step. In the literature (Bishop 2006), there are different examples of algorithms that have been pro-
posed, such as statistical, structural, or based on artificial neural networks (Jain et al. 2000). All of
them still suffer in universality and effectiveness.
12.3 IF IT WERE NP = P, WOULD BE THE COMPLEXITY CHALLENGES
SURELY WON?
In the previous paragraph, we knew the computational hurdles science encounters whenever it tries
to describe Complex Systems. Even if, one day, someone demonstrated that NP = P, one relevant
limit in the scientific knowledge would remain. It is the limit in determining the initial conditions
for every Complex System. The famous Uncertainty Principle formulated, at first, by Heisenberg
(1927), and then extended by others (Erhart et al. 2012; Furuta 2012), declares the impossibility
of determining, accurately and simultaneously, position and momentum of a microscopic particle.
The Uncertainty Principle places limits to the deterministic dream of describing the dynamics
of the entire Universe starting from the description of its tiny particles. We might get satisfied by
describing Complex Systems from the macroscopic point of view, neglecting their ultimate atomic
420
Untangling Complex Systems
constituents. However, the intrinsic unpredictability of chaotic dynamics, in the long term and at
both the microscopic and macroscopic level, would remain valid. In fact, any chaotic dynamics is
extremely sensitive to the initial conditions. And we are fully aware that the initial conditions of both
microscopic and macroscopic systems cannot be determined with an infinite degree of accuracy
because any experimental measurement is affected by unavoidable uncertainties (see Appendix D).
I think that we are now completely aware of the sharp limits we encounter in describing Natural
Complex Systems by using the fundamental physical laws. Therefore, we need to develop models to
interpret Natural Complexity. The first requirement for formulating a scientific model is to know, in
depth, the features of the systems we want to describe. Therefore, in the next pages of this chapter,
I present the characteristics of Complex Systems.
12.4 THE FEATURES OF COMPLEX SYSTEMS
The Universe is the most extensive Complex System. Within our Universe, all the galaxies are
Complex Systems. Within each galaxy, all the stars and planets are Complex Systems, as well. If we
assume that life is present only on the Earth (so far, we do not have proofs in contrast to this hypoth-
esis), then, we can plainly state that the Earth is the most complex planet in the Universe. What
makes our planet complex is its climate, its geology, and, particularly, its biosphere. The biosphere
is the global ecological system that consists of all the living beings, their relationships, and their
interactions with the geosphere, hydrosphere, and atmosphere. Every ecosystem included in the
biosphere is complex. Even every living being, either pluricellular or even unicellular, is complex.
Among the living creatures, we, humans, are the most complex, mainly due to our brains, which
confer us the power of computing with numbers and words. Of course, even human societies and the
economy are other examples of Complex Systems.
Which are the common features of so diverse Complex Systems?
12.4.1 neTworks
A Complex System consists of many elements, often diverse, if not unique. For example, the compo-
nents of an ecosystem are plainly diverse, and those belonging to a human community are evidently
unique. The elements of a Complex System are also highly interconnected. Complex Systems are
networks. They are intertwined systems. In fact, the etymology of the adjective “Complex” derives
from the Latin verb cum-plectere that means “to intertwine together.” It is different from the ety-
mology of “complicated.” “Complicated” derives from the Latin verb cum-plicare that means “to
fold together.” Anything that is “complicated,” is “folded” and can be “unfolded” (see Figure 12.3).
On the other hand, anything that is “complex,” is interwoven and cannot be “unfolded.” Instead, it
needs to be untangled.
A network is described as a collection of nodes (also called vertices) connected by links (or
edges) (Newman 2010). The nodes represent the elements of the network, and the links are the con-
nections between them (see Figure 12.4). For instance, if we consider our brain, a node is a neuron, whereas a link is either a connection between the synapse of a neuron and the dendrite of another
or a gap junction.7 In an ecosystem, the nodes could be the single species, and the links could be the ecological relationships between them. In economy, the nodes are the different individual agents
that can be firms, banks, or countries, and links are their mutual interactions, be it trade, ownership,
credit-debt relationships, or research and development alliances.
In undirected networks (see Figure 12.4a), the links between nodes do not have an assigned
direction. In directed networks, the interplay between any two nodes has a well-defined direction,
indicated by an arrow (see Figure 12.4b).
7 A gap junction is an aggregate of intercellular channels that permits transfer of ions and small molecules.
Complex Systems
421
FIGURE 12.3 Examples of “complicated” or “folded” objects, on the left, and a “complex” or “intertwined”
basket, on the right.
Node
Link
(a)
(b)
FIGURE 12.4 Simple undirected (a) and directed (b) networks.
An essential property of a node within a network is its connectivity or degree, d, which is the
number of links the node has to other nodes. For example, in (a), the grey node has degree d = 4. In
a directed network, we can define an incoming degree, din, which denotes the number of links that
point to a node ( din = 3 for the grey node in network b), and an outgoing degree, dout, which indicates the number of links that starts from it ( dout = 1 for the grey node). It is useful to define the degree
distribution, P( d), of a network. P( d) is obtained by counting the number of nodes with d = 1,2 …
, , m
and dividing each value by the total number of nodes, N. The degree distribution of the small net-
work, depicted in Figure 12.4a, is reported in Figure 12.5. An undirected network with L links is characterized by an average degree d = 2 L/ N . High-degree nodes are called hubs.
Another relevant feature of a network is the distance between nodes, which is measured by
counting the number of links we need to pass through to go from one node to another. Usually, there
are many alternative paths between two nodes. Among them, the most interesting is the shortest
path, sp, which is the path with the smallest number of links between the selected nodes. For exam-
ple, the shortest path between the white and grey nodes of the network represented in Figure 12.4a, is 2. In the directed network Figure 12.4b, the shortest path from the white node to the grey node, sp , is 2 and is different from the shortest path from the grey to the white, sp , which is equal to 3.
WG
GW
The overall navigability of a network is described by specifying its mean path length, sp, which is
the average over the shortest paths between all pairs of nodes.
422
Untangling Complex Systems
0.6
0.5
y 0.4
0.3
equencFr 0.2
0.1
0.0 0
1
2
3
4
Degree
FIGURE 12.5 Degree distribution of the small network depicted in Figure 12.4(a).
A final attribute of a network is its clustering. The clustering coefficient C of a node represents
the fraction of pairs of neighbors that are connected to one another. If a node has d neighbors, there
are d ( d − )
1 / 2 pairs of neighbors. C is the fraction of the d ( d − )
1 / 2 pairs that are linked. For
example, for the grey node of the network in Figure 12.4a, C = 2 6, whereas for the white node, C = 0. The average clustering coefficient, C, of a network quantifies the tendency of the nodes of
that network to form clusters or groups.
TRY EXERCISE 12.2
After an analysis of the structures and properties of many real networks, scientists have proposed six
main models. The first is the model of regular networks (or the so-called regular lattice structures).
An example is shown in Figure 12.6, and it is labeled as a. In a, we have 18 nodes. Each node is linked to its four nearest neighbors. The average clustering coefficient is 0.5, and the average path length is
2.5. In general, in a regular network, all the nodes have the same degree and P( d), plotted as a histo-gram, is just a single vertical segment. The mean path length, sp, is long and the clustering is high.
P( d)
log( P( d))
(a)
d
(d)
log( d)
P( d)
P( d)
(b)
d
(e)
d
P( d)
log( P( d))
d
log( d)
(c)
(f)
FIGURE 12.6 Six models of networks and their characteristic degree distributions. (a–f) represent regular,
small-world, random, scale-free, modular, and hierarchical networks, respectively.
Complex Systems
423
If we substitute a few of the nearest-neighbor links of a into long-distance links, chosen randomly,
we obtain the second model, a small-world network, like that labeled as b in Figure 12.6. In a small-world network (Watts and Strogatz 1998) sp, becomes short because the long-range edges are “short-
cuts” connecting vertices that would otherwise be much farther apart. Nonetheless, the nodes of the
network maintain high clustering coefficients and P( d) is still a discrete distribution. When the links of a regular network are completely randomized, we obtain the third model, a random network like
that labeled as c in Figure 12.6. In a random network, the degree distribution is a Poisson function (Erdös and Rényi 1960) peaked at d and decaying exponentially for d d . This trend implies that most of the nodes have approximately the same number of links, close to the average d. Since the tail
of P(d) decreases exponentially, nodes with degree larger than d are extremely rare. The mean path
length is short, like in a small-world network, but the clustering coefficient is low and independent of
a node’s degree, so C( d) appears as a horizontal line when plotted as a function of d (Barabási and Oltvai 2004).
TRY EXERCISE 12.3
The fourth model is the scale-free network (see d in Figure 12.6) that is characterized by a power-law degree distribution. The probability that a node has degree d is P( d ) = d−γ, where γ is a positive constant, usually included between 2 and 3 (Barabási and Oltvai 2004). The power-law distribution
entails that most of the nodes have low degree, whereas there are just a few hubs. The scale-free
network has a fractal-like structure and the features that are typical of a small-world network, i.e.,
short mean path length and high average clustering coefficient.
The fifth model is the modular network (see e in Figure 12.6), where one can easily identify groups of nodes that are highly interconnected with each other (the so-called modules) but have only a few or
no links to nodes outside of the group to which they belong to. A modular network is characterized by
a discrete degree distribution because nodes within single modules have robust connectivity, whereas
those bridging modules have sparse connectivity. The average cluster coefficient is significant.
TRY EXERCISE 12.4
The sixth model is the hierarchical network with clusters of nodes that combine in an iterative man-
ner (Ravasz et al. 2002). If we look carefully at the graphical example labeled as f in Figure 12.6, we notice nodes that appear in the center of the network have the highest degree and the lowest
clustering coefficient, while the nodes at the periphery of the network have low degrees and the
most significant clustering coefficients. In between are nodes with a moderate degree and moderate
clustering. It results that both P( d) and C( d) are power-law functions.
TRY EXERCISE 12.5
Most real networks are difficult to deal with, for different reasons (Strogatz 2001). First, their
structures can be intricate tangles, and they may continuously evolve. Second, their connections can
have different strengths and effects, and they may change, as well. Third, and most often, nodes
are diverse, and their internal state can vary in time in complicated, non-linear, ways. An example
of an incredibly dynamic network is the immune system that protects our bodies from infections
and intruders (antigens) (read Box 12.1 in this chapter). Despite all these difficulties, the study of
many biological, technological and social networks has demonstrated the ubiquity of small-world
properties. Networks as diverse as the cell, the brain, the food web, the internet, the World Wide Web,8
8 The Internet is the network of computers and routers with the wires and cables, which physically connect them, as edges.
The World Wide Web is the network of web pages where the edges are the hyperlinks (URL’s) that point from one page
to another.
424
Untangling Complex Systems
collaboration of movie actors, collaboration of scientists, the connections of bank, international trade
relationships, the words in human language, and so on, all exhibit short mean path lengths and high
average clustering coefficients with respect to hypothetical random networks having the same number
of nodes (Albert and Barabási 2002; Schweitzer et al. 2009). Small-world features guarantee an
enhanced signal-propagation speed, computational power, spread of fads and infectious diseases, and
synchronizability (Watts and Strogatz 1998). Among the investigated real networks, most of them
exhibit a degree distribution that follows a power law. This evidence suggests that many Complex
