Untangling Complex Systems, page 91
the complement of the first name of the end city (GGCT, for London). The first primer induced
DNA polymerase to copy complements of sequences that had the right start city, and the second
primer initiated the duplication of strands that had the correct end city. PCR proceeds through
thermocycling: high temperatures denature double-stranded DNA, whereas cooler conditions favor the duplication of the individual strands by DNA polymerase. The result was
that strands with both the right start and end compositions were duplicated at an exponential rate.
On the other hand, strands that had the right start city but not the right end city, or vice versa,
were duplicated much more slowly. DNA sequences that had neither the right start nor the right
end were not duplicated at all. In this manner, the first stage of the flowchart in Figure 13.8 was accomplished. Then, using gel electrophoresis, a technique that separates DNA molecules of different lengths in an electric field, Adleman identified the strands of the right length (24 bases in the example of Figure 13.10). This operation completed the second step of the flowchart. To check whether the remaining sequences passed through all the intermediary cities, Adleman used a procedure known as affinity separation. Many copies of the sequence
that is the complementary name of Berlin were attached to microscopic iron beads. These tagged
beads were suspended in the test tube containing the remaining strands, under conditions favorable
for the hybridization reaction. Only those strands that contained Berlin's sequence annealed
to the segments bound to the beads. With a magnet placed in contact with the wall of the test tube, Adleman attracted the beads and could discard the liquid containing the strands that did not have the desired city's name. He then added new solvent and removed the magnet to resuspend the beads, and by heating he induced the strands to detach from them. Next, he reapplied the magnet
to attract the beads to the wall of the test tube. The liquid, which now contained the desired DNA
strands, was recovered for further screening. The process was repeated for the other intermediary
cities, such as Madrid in the example of Figure 13.10. After the affinity separations, the final step of the flowchart of Figure 13.8 was complete, and Adleman could be confident that the DNA strands left in the test tube encoded the solution of the Hamiltonian Path
Problem. This result was confirmed by an additional PCR, followed by another gel-electrophoresis
separation. Although the DNA found the solution in just one second, the entire procedure of separation and recognition of the right solution required seven days in the laboratory. Adleman's idea is brilliant, but the procedure is impractical for larger Hamiltonian Path Problems (Linial and Linial 1995). In fact, the mass of DNA needed for the computation would become too large, and the exhaustive search for the right solution would be like finding a
small needle in an enormous haystack. Moreover, the search may be hindered by the unavoidable
artifacts that PCR can introduce in the amplification stages.
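The molecular filtering pipeline just described can be mimicked in silico. The following Python sketch is a toy simulation of Adleman's screening steps; the edge list is an assumption chosen for illustration, not the actual network of Figure 13.10:

```python
import random

random.seed(0)

# Toy directed graph; this edge list is a hypothetical example,
# not the actual network of Figure 13.10.
edges = [("London", "Berlin"), ("London", "Madrid"),
         ("Berlin", "Madrid"), ("Berlin", "Rome"),
         ("Madrid", "Rome")]
cities = {"London", "Berlin", "Madrid", "Rome"}
start, end = "London", "Rome"

def random_paths(n, max_len):
    """Mimic the random ligation step: start from a random edge and
    extend the path with compatible edges chosen at random."""
    paths = []
    for _ in range(n):
        path = list(random.choice(edges))
        while len(path) < max_len and random.random() < 0.7:
            next_cities = [b for (a, b) in edges if a == path[-1]]
            if not next_cities:
                break
            path.append(random.choice(next_cities))
        paths.append(path)
    return paths

pool = random_paths(10_000, max_len=6)

# Step 1 (PCR analogue): keep strands with the right start and end city.
pool = [p for p in pool if p[0] == start and p[-1] == end]

# Step 2 (gel electrophoresis analogue): keep strands of the right
# length, i.e. paths visiting as many cities as the graph contains.
pool = [p for p in pool if len(p) == len(cities)]

# Step 3 (affinity separation analogue): retain, city by city, only the
# strands that contain each intermediary city.
for city in cities - {start, end}:
    pool = [p for p in pool if city in p]

solutions = {tuple(p) for p in pool}
print(solutions)   # the Hamiltonian path(s) from start to end
```

With this toy graph, the massively parallel generate-and-filter strategy leaves exactly the Hamiltonian path, just as the wet-lab procedure did.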
Probably, DNA computing is more promising in vivo, inside living cells. The dream is to implement a "chemical robot," or what Ehud Shapiro from the Weizmann Institute has also called a "Doctor in a cell." The "Doctor in a cell" is an autonomous molecular computing device that, after analyzing the surrounding cellular microenvironment, decides whether or not to release a specific drug (Benenson et al. 2004). For example, an oligonucleotide AND gate has been engineered to respond to specific microRNA inputs, which are indicators of cancer, with a fluorescent output. The DNA AND gate has been introduced into living mammalian cells (Hemphill and Deiters
2013). In other examples, chemical robots have been implemented by exploiting the molecular self-assembly of DNA, which can generate arbitrary two- and three-dimensional shapes at the nanometer scale through "scaffolded DNA origami" (Rothemund 2006). For example, nanoscale robots,
based on DNA origami, are capable of dynamically interacting with each other in the cells of a
living cockroach (Amir et al. 2014). The interactions generate logic outputs, which are relayed to
switch molecular payloads on or off. The payload can be a molecule such as a short DNA sequence,
an antibody, or an enzyme. If the payload can activate or inactivate another robot, this will create a
circuit inside a living cell.
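The decision logic of such intracellular gates, whether fluorescing or switching a payload, can be stated compactly in code. The sketch below is a purely logical model of the oligonucleotide AND gate described above; the microRNA names are placeholders, not those used by Hemphill and Deiters (2013):

```python
def and_gate(mirna_inputs, required=("miR-X", "miR-Y")):
    """Fluoresce (True) only if every required microRNA input is
    present. The miRNA names are hypothetical placeholders."""
    return all(m in mirna_inputs for m in required)

assert and_gate({"miR-X", "miR-Y"})    # both cancer markers -> output
assert not and_gate({"miR-X"})         # a single marker -> no output
assert not and_gate(set())
```

The molecular implementation is, of course, far subtler: the "inputs" are hybridization events and the "output" is a conformational change, but the Boolean behavior is the same.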
RNA is another promising material for computing within living beings. In fact, RNA is involved
in many regulatory networks. Moreover, RNA is structurally more versatile and more thermody-
namically stable than DNA. It displays a structural flexibility and functional diversity similar to
that of proteins, including enzymatic activities. The discovery of diverse RNA functions and the
methods to produce fluorogenic RNA in a cell suggest that small RNAs can work cooperatively,
synergistically, or antagonistically to implement computational logic circuits. With all these advantages, RNA computation is promising, but it is still in its infancy (Qiu et al. 2013).
DNA has another important application: it could be used as a memory for data, perhaps even for the overwhelming Big Data. Life on Earth teaches us that DNA is an alluring material for information storage. In fact, a whole genome that fits into a microscopic cell contains all the instructions required by a living being to survive. Moreover, if a genome is held in dry and dark conditions, it can remain almost intact for hundreds, thousands, and even millions of years (Pääbo et al. 2004). So, DNA is an ultracompact and long-lasting storage material. Scientists have been storing digital data in
DNA since 2012. That was when George Church and his colleagues encoded a 53,426-word book
using less than a picogram of DNA (Church et al. 2012). To encode the digital file, the researchers divided it into tiny blocks of data and converted these blocks into DNA's four-letter alphabet. They encoded one bit per base: A or C for zero, G or T for one. Each DNA fragment was replicated thousands of times. Since errors in synthesis and sequencing are rarely coincident, each molecular copy
corrects errors in the other copies. Moreover, each DNA fragment contained a digital “barcode” that
recorded its location in the original file. Reading the data required a DNA sequencer and a computer
to reassemble all of the fragments in order and convert them back into digital format. Density, stabil-
ity, and energy efficiency are all potential advantages of DNA storage (Extance 2016), while costs
and times for writing and reading are currently impractical. So, the DNA storage approach is not suitable when data are needed instantly, but it is well suited to archival applications.
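The encoding just described can be sketched in a few lines of Python. The one-bit-per-base alphabet (A or C for zero, G or T for one) and the 19-bit address barcode on 96-bit data blocks follow the scheme of Church et al. (2012) as summarized above; the remaining framing details (fixed-width barcode, no primer sites, no redundancy beyond replication) are simplifying assumptions:

```python
import random

BIT_TO_BASES = {"0": "AC", "1": "GT"}  # A or C encodes 0; G or T encodes 1
BASE_TO_BIT = {"A": "0", "C": "0", "G": "1", "T": "1"}

def encode(data: bytes, block_bits: int = 96):
    """Split a file into small blocks and convert each block to bases,
    prefixing a 19-bit address barcode giving its position in the file."""
    bits = "".join(f"{byte:08b}" for byte in data)
    fragments = []
    for index in range(0, len(bits), block_bits):
        payload = f"{index // block_bits:019b}" + bits[index:index + block_bits]
        # each bit maps to one of two possible bases, chosen at random,
        # which helps avoid long single-letter runs
        fragments.append("".join(random.choice(BIT_TO_BASES[b])
                                 for b in payload))
    return fragments

def decode(fragments):
    blocks = {}
    for fragment in fragments:
        bits = "".join(BASE_TO_BIT[base] for base in fragment)
        blocks[int(bits[:19], 2)] = bits[19:]       # barcode -> data block
    bitstream = "".join(blocks[i] for i in sorted(blocks))
    return bytes(int(bitstream[i:i + 8], 2)
                 for i in range(0, len(bitstream), 8))

message = b"DNA stores data"
fragments = encode(message)
random.shuffle(fragments)      # molecules in a test tube carry no order
assert decode(fragments) == message
```

The shuffle before decoding underlines the role of the barcode: since the fragments in a test tube are unordered, the address is what lets the computer reassemble them into the original file.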
TRY EXERCISE 13.1
13.3.1.4 Evolutionary Computing
Biological evolution is the change in life forms that has occurred over the last four billion years and that continues to happen on Earth. Darwin (1859) supposed that the changes were gradual, but the
fossil record also suggests that long periods of no change in the morphology of organisms have been
punctuated by relatively short periods of significant change, with the disappearance of some spe-
cies and the emergence of new ones (Eldredge and Gould 1972). The evolution of biological forms
[Figure 13.12 (flowchart): self-organizing living matter in its original form evolves, under slow changes of the environmental conditions (which induce epigenetic events), global historical events (e.g., the impact of large meteorites), and local historical events (e.g., random genetic mutations), into self-organizing living matter in its new form; through natural selection, it either survives in the offspring or it dies.]
FIGURE 13.12 The main factors contributing to the evolution of biological forms.
is both gradual and sudden, and it depends on more than one factor (see Figure 13.12). First, every living being has the power of self-organizing, of creating order within itself, with a tendency to become ever more complex (Kauffman 1993). The evolution towards new forms may be
induced by the slow changes of the environmental conditions that trigger epigenetic events: some
genetic switches are turned entirely or partly off, and some others are completely or partially turned
on. However, the evolution may also be induced by either global historical events (such as the impact
of large meteorites on Earth) that provoke significant changes in the environment, or local histori-
cal events such as “random” genetic mutations that are sporadic and impossible to predict. The new
biological forms are continuously tested, through natural selection, on how well they fit their ecosystem: those that succeed in surviving and reproducing continue to exist through their offspring. Sexual reproduction favors variability because the offspring have a genetic code that is a mixture of those of the parents.
The idea of imitating the basic features that govern the natural evolutionary processes in com-
puter science came onto the scene by the early 1960s. Such research is known as Evolutionary
Computation. There are different mainstreams, but the best known is the Genetic Algorithm (GA) proposed by Holland (1975). The input to a GA consists of a population of candidate programs. A candidate program is written as a string of bits, numbers, or symbols to accomplish a task. The initial programs can be generated randomly. Each program is tested on how well it performs the desired task, which determines its fitness value. The programs with the highest fitness values are selected to be parents of the next generation. Offspring are generated by merging parts of the parents. The new programs are then tested for their fitness, and the entire procedure is repeated over and over again until the achieved fitness values are satisfactory. Evolutionary Computation and Genetic
Algorithms have been used to solve problems in many scientific and engineering areas, as well as in
art and architecture (Beasley 2000).
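The generate-evaluate-select-recombine loop described above can be sketched in a few lines. The example below evolves bit strings toward a toy fitness function (the classic "OneMax" task of maximizing the number of ones, chosen here only for illustration); the population size, mutation rate, and truncation selection scheme are arbitrary assumptions:

```python
import random

random.seed(1)
GENOME_LEN, POP_SIZE, GENERATIONS = 32, 40, 60

def fitness(genome):
    # Toy task ("OneMax"): the ideal program is the all-ones string.
    return sum(genome)

def crossover(mum, dad):
    cut = random.randrange(1, GENOME_LEN)   # merge parts of the two parents
    return mum[:cut] + dad[cut:]

def mutate(genome, rate=0.01):
    # rare random bit flips play the role of genetic mutations
    return [bit ^ (random.random() < rate) for bit in genome]

# initial population of random candidate "programs" (bit strings)
population = [[random.randint(0, 1) for _ in range(GENOME_LEN)]
              for _ in range(POP_SIZE)]

for _ in range(GENERATIONS):
    # the fittest half are selected as parents of the next generation
    parents = sorted(population, key=fitness, reverse=True)[:POP_SIZE // 2]
    population = [mutate(crossover(random.choice(parents),
                                   random.choice(parents)))
                  for _ in range(POP_SIZE)]

best = max(population, key=fitness)
print(fitness(best))
```

After a few dozen generations the best fitness approaches the maximum, showing how selection, crossover, and mutation together climb the fitness landscape without any explicit search strategy.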
13.3.1.5 Artificial Immune Systems
Our immune system has an innate and an adaptive component (read Box 13.1 in Chapter 12 for more details), with a diverse set of cells, molecules, and organs that work in concert to protect our body. Our immune system has some attributes that are worth mimicking. First, there is the pattern
recognition ability of the receptors. A healthy immune system is very effective in distinguishing
harmful substances from the body’s own components and directing its actions accordingly. Second,
the immune subsystems, especially the adaptive one, exhibit the ability to learn and remember
previously encountered pathogens. When a new pathogen appears, the Immune System learns how
to defeat it (the primary response). The next time that pathogen is encountered, a faster and often
more aggressive response is elicited (the secondary response). Finally, the immune system is spread
throughout the body and acts collectively without any central controller.
Artificial immune systems are designed by following a procedure (de Castro and Timmis 2002)
that requires the definition of the (1) representation, (2) affinity measure, and (3) immune algorithm.
The first step is the selection of an abstract representation of the components of the system, consid-
ering the application domain. Any immune response requires the shape recognition of an antigen
by the cell receptors. For modeling this shape recognition process, the concept of shape-space has
been introduced (Perelson and Oster 1979). The shape-space approach assumes that the properties
of the receptors and antigens can be represented by a data structure. Most often, the data structure
is an attribute string, which can be a real-valued vector, a binary string, et cetera (de Castro 2007).
The affinity of a receptor towards either an antigen or an element of the organism is measured by
calculating the distance between the two corresponding attribute strings in the shape-space. Finally,
an immune-inspired algorithm is selected. Various types of immune-inspired algorithms exist, and the choice must be made according to the problem to be solved. For example, clonal selection-based algorithms attempt to capture mechanisms of the antigen-driven proliferation of B-cells,
which induce an improvement in their binding abilities. Clonal selection algorithms capture the
properties of learning, memory, adaptation, and pattern recognition (Timmis et al. 2008). Other
examples are the immune network algorithms (Jerne 1974) that view the immune system as a regu-
lated network of molecules and cells that recognize each other. Antibodies and cells are nodes of the
network, and the training algorithm determines the growth or pruning of edges between the nodes,
based on their mutual affinities. The immune network acts in a self-organizing manner and gener-
ates memory effects. Immune network algorithms have been used in clustering, data visualization,
control, and optimization domains. In general, the theory of artificial immune systems finds broad
applicability in machine-learning, pattern recognition, security of information systems, and data
analysis (de Castro 2007).
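The shape-space machinery and clonal selection can be illustrated with a small sketch. Here receptors and antigens are binary attribute strings, affinity is taken as the Hamming distance between them (complementary positions are assumed to bind), and the hypermutation rate decreases as affinity grows; all numerical parameters are arbitrary choices for illustration:

```python
import random

random.seed(2)
STRING_LEN = 16

def affinity(receptor, antigen):
    """Shape-space affinity for binary attribute strings: here, the
    Hamming distance, i.e. the number of complementary positions."""
    return sum(r != a for r, a in zip(receptor, antigen))

def mutate(receptor, rate):
    return [bit ^ (random.random() < rate) for bit in receptor]

antigen = [random.randint(0, 1) for _ in range(STRING_LEN)]
repertoire = [[random.randint(0, 1) for _ in range(STRING_LEN)]
              for _ in range(30)]

for _ in range(40):
    # clonal selection: the best-binding receptors proliferate...
    best = sorted(repertoire, key=lambda r: affinity(r, antigen),
                  reverse=True)[:5]
    clones = []
    for receptor in best:
        # ...and hypermutate at a rate that decreases as affinity grows
        rate = 1.0 - affinity(receptor, antigen) / STRING_LEN + 0.01
        clones += [mutate(receptor, rate) for _ in range(6)]
    repertoire = best + clones   # keeping the parents adds elitism

matured = max(repertoire, key=lambda r: affinity(r, antigen))
```

After the affinity maturation loop, the repertoire contains receptors nearly complementary to the antigen, capturing in miniature the learning and pattern-recognition properties discussed above.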
13.3.1.6 Cellular Automata
A cellular automaton is a mathematical model of a dynamical system that is discrete in both space
and time. In fact, a d-dimensional cellular automaton is a d-dimensional grid of cells, each of which can take on a value from a finite set of integers (most often, just two values, 1 or 0, on or off). The
value of each cell at time step t + 1 is a function of the values of a small local neighborhood of cells
at time t. The cells update their state simultaneously according to a given local rule. The idea of
cellular automata came to John von Neumann, back in the 1940s, based on a suggestion made by
his colleague, the mathematician and nuclear physicist Stanislaw Ulam (both of them participated in the Manhattan Project during the Second World War). von Neumann was trying to implement in machines the phenomenon of self-reproduction, which is peculiar to living matter, and he thought he could succeed by inventing cellular automata. In fact, cellular automata are examples of mathematical
models constructed from many simple identical components that, together, are capable of complex
behavior. This has been demonstrated by the physicist Stephen Wolfram (2002) using the simplest possible form, the one-dimensional, two-state cellular automaton. It consists of a line
of sites, with each site carrying a value a_i of 0 or 1 (white or black, dead or alive). The value a_i of the site at each position i is updated in discrete time steps according to an identical deterministic rule that is a function of the values of its neighbors:

a_i^{(t+1)} = f( a_{i-1}^{(t)}, a_{i+1}^{(t)} )    [13.3]
[Figure 13.13: four space-time plots, labeled Class 1, Class 2, Class 3, and Class 4, each showing about 50 time steps of a one-dimensional cellular automaton.]
FIGURE 13.13 Examples of the four classes of spatiotemporal patterns generated by one-dimensional cel-
lular automata following deterministic rules. These patterns have been achieved by using “ElementaryCAs”
model, Complexity Explorer project, http://complexityexplorer.org.
Different cellular automaton rules yield very different patterns. Different initial states with a par-
ticular cellular automaton rule yield patterns that differ in the details but are similar in form and
statistical features. All the possible patterns can be grouped into four classes (see Figure 13.13).
• Class 1: Independently of the initial configurations, all cells in the lattice switch to the off
state and stay that way, indefinitely. The final pattern is spatially homogeneous.
• Class 2: The final patterns are uniform or periodic, depending on the initial configuration.
• Class 3: The patterns are aperiodic and chaotic.
