Untangling Complex Systems, page 102
\[
\frac{\partial \chi^2}{\partial B} = -2 \sum_{i=1}^{n} w_i x_i \left( y_i - A - B x_i \right) = 0
\]
From the equations [D.19], wherein $w_i = 1/\sigma_{y_i}^2$, we obtain the least-squares estimates of the coefficients A and B:
\[
A = \frac{\sum_{i=1}^{n} w_i x_i^2 \sum_{i=1}^{n} w_i y_i - \sum_{i=1}^{n} w_i x_i \sum_{i=1}^{n} w_i x_i y_i}{\sum_{i=1}^{n} w_i \sum_{i=1}^{n} w_i x_i^2 - \left( \sum_{i=1}^{n} w_i x_i \right)^2} \qquad [D.20]
\]

\[
B = \frac{\sum_{i=1}^{n} w_i \sum_{i=1}^{n} w_i x_i y_i - \sum_{i=1}^{n} w_i x_i \sum_{i=1}^{n} w_i y_i}{\sum_{i=1}^{n} w_i \sum_{i=1}^{n} w_i x_i^2 - \left( \sum_{i=1}^{n} w_i x_i \right)^2} \qquad [D.21]
\]
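Equations [D.20] and [D.21] translate almost literally into code. The sketch below is an illustration, not text from the book: the function name and the test data are my own, and each point is assumed to carry an uncertainty σ_i from which the weight w_i = 1/σ_i² is formed.

```python
# Weighted least-squares estimates of the intercept A and slope B
# for the model y = A + B*x, following equations [D.20] and [D.21].
def weighted_fit(x, y, sigma_y):
    w = [1.0 / s ** 2 for s in sigma_y]        # weights w_i = 1/sigma_i^2
    Sw = sum(w)
    Swx = sum(wi * xi for wi, xi in zip(w, x))
    Swy = sum(wi * yi for wi, yi in zip(w, y))
    Swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    Swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    delta = Sw * Swxx - Swx ** 2               # common denominator
    A = (Swxx * Swy - Swx * Swxy) / delta      # equation [D.20]
    B = (Sw * Swxy - Swx * Swy) / delta        # equation [D.21]
    return A, B

# Points lying exactly on y = 1 + 2x, all with uncertainty 0.5:
A, B = weighted_fit([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0], [0.5] * 4)
print(A, B)   # 1.0 2.0
```

Because the data here lie exactly on a line, the fit recovers the intercept and slope exactly; with scattered data the same formulas return the weighted best estimates.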
When all the uncertainties in the $y_i$ values are equal, the formulas for the determination of A and B are like equations [D.20] and [D.21] but with all the weights $w_i = 1$. It may happen that we collect each pair of points $(x_i, y_i)$ just once, and we estimate only their systematic uncertainty due to the equipment. In such cases, we may still use equations [D.20] and [D.21] to evaluate the A and B coefficients. Moreover, we may also determine a random-type uncertainty $\sigma_y$ in the data $(y_1, y_2, \ldots, y_n)$ by the following formula:
\[
\sigma_y^2 = \frac{1}{n-2} \sum_{i=1}^{n} \left( y_i - A - B x_i \right)^2 \qquad [D.22]
\]
4 The procedure of minimizing the sum of squares appearing in the exponent, χ², gives the method its name: least-squares fitting.
Equation [D.22] is based on the assumption that each $y_i$ is normally distributed about its true value $A + Bx_i$ with width parameter $\sigma_y$.5 After having found the uncertainty $\sigma_y$ in the measured quantities $(y_1, y_2, \ldots, y_n)$, we can calculate the uncertainties in A and B by the error propagation formula. The result is:
\[
\sigma_A = \sigma_y \sqrt{\frac{\sum_{i=1}^{n} x_i^2}{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2}} \qquad [D.23]
\]
\[
\sigma_B = \sigma_y \sqrt{\frac{n}{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2}} \qquad [D.24]
\]
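The equal-uncertainty case, equations [D.20]–[D.24] with all $w_i = 1$, can be sketched as follows. The function name and the sample data are illustrative, not from the text:

```python
import math

# Unweighted least-squares fit plus the uncertainties of
# equations [D.22]-[D.24].
def fit_with_uncertainties(x, y):
    n = len(x)
    Sx, Sy = sum(x), sum(y)
    Sxx = sum(xi * xi for xi in x)
    Sxy = sum(xi * yi for xi, yi in zip(x, y))
    delta = n * Sxx - Sx ** 2
    A = (Sxx * Sy - Sx * Sxy) / delta        # [D.20] with all w_i = 1
    B = (n * Sxy - Sx * Sy) / delta          # [D.21] with all w_i = 1
    # Random-type uncertainty of the data, equation [D.22]:
    sigma_y = math.sqrt(sum((yi - A - B * xi) ** 2
                            for xi, yi in zip(x, y)) / (n - 2))
    sigma_A = sigma_y * math.sqrt(Sxx / delta)   # [D.23]
    sigma_B = sigma_y * math.sqrt(n / delta)     # [D.24]
    return A, B, sigma_y, sigma_A, sigma_B

# Four points scattered slightly about a line:
A, B, s_y, s_A, s_B = fit_with_uncertainties(
    [0.0, 1.0, 2.0, 3.0], [0.1, 0.9, 2.1, 2.9])
print(A, B)        # A ≈ 0.06, B ≈ 0.96
print(s_y, s_A, s_B)
```

Note the $(n-2)$ in the computation of $\sigma_y$: with only two points the residuals vanish identically and no scatter can be estimated, exactly as footnote 5 explains.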
The extent to which a set of points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ supports a linear relation between the variables x and y is measured by the correlation coefficient r:

\[
r = \frac{\sum_{i=1}^{n} \left( x_i - \bar{x} \right)\left( y_i - \bar{y} \right)}{\left[ \sum_{i=1}^{n} \left( x_i - \bar{x} \right)^2 \sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2 \right]^{1/2}} = \frac{\sigma_{xy}}{\sigma_x \sigma_y} \qquad [D.25]
\]
The possible values of the correlation coefficient are in the range $-1 \le r \le +1$. If the points are correlated, $\sigma_{xy} \approx \sigma_x \sigma_y$ and $r \approx 1$. On the other hand, if the points are uncorrelated, then $\sigma_{xy} \approx 0$ and $r \approx 0$.
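Equation [D.25] is a one-line computation. A minimal sketch (function name mine), exercised on a perfectly correlated and a perfectly anti-correlated data set:

```python
import math

# Correlation coefficient r of equation [D.25].
def correlation(x, y):
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    num = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    den = math.sqrt(sum((xi - xbar) ** 2 for xi in x)
                    * sum((yi - ybar) ** 2 for yi in y))
    return num / den

print(correlation([1, 2, 3, 4], [2, 4, 6, 8]))   # 1.0  (y grows with x)
print(correlation([1, 2, 3, 4], [8, 6, 4, 2]))   # -1.0 (y falls as x grows)
```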
D.8 TEST χ²
When we collect data about the value of a variable or a parameter, and we assume to describe them through a specific function or distribution, we might need a parameter that quantifies the agreement between the observed and the expected values. Such a parameter exists, and it is the χ²:
\[
\chi^2 = \sum_{i=1}^{n} \left( \frac{\text{observed value}_i - \text{expected value}_i}{\text{standard deviation}_i} \right)^2 \qquad [D.26]
\]
If the agreement is good, χ² will be of the order of the number of degrees of freedom (d), i.e., the number n of the collected data minus the number of the parameters calculated through the data. If the agreement is poor, χ² will be much larger than d. A slightly more convenient way to think about this test is to introduce the reduced chi-squared ($\tilde{\chi}^2$), defined as:
\[
\tilde{\chi}^2 = \frac{\chi^2}{d} \qquad [D.27]
\]
5 The factor (n − 2) appearing in equation [D.22] is analogous to the factor (n − 1) of equation [D.4]. Both factors represent the degrees of freedom, which is the number of independent measurements minus the number of parameters calculated from those measurements. It is not difficult to remember the factors (n − 1) and (n − 2) because they are reasonable. In fact, if we collect just one datum regarding the value of a variable, we cannot calculate the standard deviation. Similarly, if we collect only two pairs of data, $(x_1, y_1)$ and $(x_2, y_2)$, we can always find a line that passes exactly through both points, and the least-squares method gives this line. With just two pairs of data, we cannot deduce anything about the reliability of the straight line.
Whatever the number of degrees of freedom, if we obtain a $\tilde{\chi}^2$ of the order of one or less, we have no reason to doubt our expected distribution or function. On the other hand, if we obtain a $\tilde{\chi}^2$ much larger than one, our expected distribution or function is unlikely to be correct.
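Equations [D.26] and [D.27] can be put together in a few lines. The function name and the numbers below are my own illustration: five measurements compared with a model predicting 10.0 for each, all with standard deviation 0.5, and two fitted parameters:

```python
# Chi-squared and reduced chi-squared, equations [D.26] and [D.27].
def reduced_chi_squared(observed, expected, sigma, n_params):
    chi2 = sum(((o - e) / s) ** 2
               for o, e, s in zip(observed, expected, sigma))
    d = len(observed) - n_params        # degrees of freedom
    return chi2 / d

obs = [10.2, 9.7, 10.1, 9.9, 10.4]
exp = [10.0] * 5
print(reduced_chi_squared(obs, exp, [0.5] * 5, n_params=2))
# ≈ 0.41: of order one, so no reason to doubt the model
```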
D.9 CONCLUSION
Now it is clear why we say that science is exact. The rigorous treatment of the uncertainties in each measurement allows us to quantify how much we can trust our scientific assertions. It is also evident that, due to random errors, many experiments are unique and unreproducible events. This statement is especially true when we deal with chaotic systems.
D.10 HINTS FOR FURTHER READING
More information about error analysis in measurements can be found in the books An Introduction to Error Analysis by Taylor (1997), A Practical Guide to Data Analysis for Physical Science Students by Lyons (1991), and Statistical Treatment of Experimental Data by Young (1962).
Appendix E: Errors in Numerical Computation
We are successful because we use the right level of abstraction.
Avi Wigderson (1956 AD–)
Calculations, just like experimental measurements, are subject to errors. In fact, any scientist, whatever his or her expertise, introduces inevitable uncertainties not only when performing an experiment but also when computing with calculators, numerical methods, and algorithms. It is essential to have a notion of the errors that can be made in computations.
E.1 ROUNDOFF ERRORS
Whatever is used as our computing machine, it can only represent numerical amounts using a limited number of significant figures. Thus, irrational numbers such as π, Napier's constant e, and $\sqrt{2}$ cannot be represented exactly. The discrepancy between the real number and its representation in the computer is called roundoff error. Any real number N can be written with the floating-point method as
\[
N = p\,b^q \qquad [E.1]
\]
In [E.1], p is the mantissa, b is the base of the numeration system, and q is called the exponent or the characteristic.1 In a calculator, the space available for the representation of any real number in binary digits is organized as shown in Figure E.1, wherein s represents the sign of the number.
In most of the modern binary calculators ($b = 2$), a number N is represented with one of two different precisions, called single and double precision, respectively. In single precision,
32 bits are available to represent N: 23 bits are used to define the significant digits of p, 8 bits are used to store the exponent q, and one bit is used to store the sign s ("0" for positive numbers and "1"
for negative numbers). In double precision, 64 bits are available, with 52 digits used to represent the
significant figures. For instance, the number π is represented as 3.141593 in single precision, and
as 3.141592653589793 in double precision. Evidently, the use of double precision reduces
the effects of rounding error. However, it increases the run-time of the computation. There are two
techniques of rounding. Let us assume that the maximum number of digits in the mantissa is n.
The distance between two consecutive mantissas, $p_1$ and $p_2$, is equal to $b^{-n}$: $p_2 - p_1 = b^{-n}$. We may round a mantissa p, included between $p_1$ and $p_2$ (see the left part of Figure E.2), by chopping all the digits that are on the right of the n-th digit. Any mantissa p of Figure E.2 is then substituted by $p_1$. The roundoff error is
\[
\left| \bar{p} - p \right| < b^{-n} \qquad [E.2]
\]
An alternative technique of rounding consists in adding $\frac{1}{2}b^{-n}$ to the mantissa p and then chopping the result at the n-th digit. All the mantissas included in the range $\left( p_1,\; p_1 + \frac{1}{2}b^{-n} \right)$ are substituted by $p_1$,
1 The representation of the number $N = p\,b^q$ is defined as normalized when $b^{-1} \le p < 1$. For instance, the representation of $a_1 = 12.73$ is normalized when $a_1 = 0.1273 \times 10^2$, whereas that of $a_2 = 0.00098$ is normalized when $a_2 = 0.98 \times 10^{-3}$.
FIGURE E.1 Representation of a real number in a calculator: s is the sign, q is the exponent, and p is the mantissa.
FIGURE E.2 On the left, a true mantissa p is included between the two consecutive mantissas, $p_1$ and $p_2$, rounded by the calculator. On the right, a true real number N is included between two consecutive numbers, $N_1$ and $N_2$, rounded by the calculator.
whereas all the mantissas included in the range $\left( p_1 + \frac{1}{2}b^{-n},\; p_2 \right)$ are substituted by $p_2$. Therefore, if we indicate with $\bar{p}$ the rounded mantissa, the roundoff error is

\[
\left| \bar{p} - p \right| \le \frac{1}{2}\, b^{-n} \qquad [E.3]
\]
If $N = p\,b^q$ is a real number whose rounded version is $\bar{N} = \bar{p}\,b^q$ (see the right part of Figure E.2), the absolute errors in N according to equations [E.2] and [E.3] are:

\[
\left| \bar{N} - N \right| < b^{q-n} \ \text{(as in [E.2])}, \qquad \left| \bar{N} - N \right| \le \frac{1}{2}\, b^{q-n} \ \text{(as in [E.3])} \qquad [E.4]
\]
whereas the relative errors are:

\[
\frac{\left| \bar{N} - N \right|}{\left| N \right|} \le b^{1-n} \ \text{(as in [E.2])}, \qquad \frac{\left| \bar{N} - N \right|}{\left| N \right|} \le \frac{1}{2}\, b^{1-n} \ \text{(as in [E.3])} \qquad [E.5]
\]
We define the relative error [E.5] as the accuracy of the calculator (or machine accuracy, MA). It is equal to either $MA = b^{1-n}$ or $MA = \frac{1}{2}b^{1-n}$, depending on whether the first or the second rounding methodology is used in the calculator.
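The spacing $b^{1-n}$ between 1.0 and the next representable number can be probed empirically. The sketch below assumes IEEE 754 double precision ($b = 2$, $n = 53$); with round-to-nearest, the machine accuracy bound of [E.5] is half the printed spacing:

```python
import sys

# Halve eps until adding it to 1.0 no longer changes the stored result;
# eps then equals the spacing of floating-point numbers just above 1.0.
eps = 1.0
while 1.0 + eps / 2 != 1.0:
    eps /= 2
print(eps)                             # 2.220446049250313e-16, i.e. 2**-52
print(eps == sys.float_info.epsilon)   # True: Python reports the same value
```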
E.2 PROPAGATION OF THE ROUNDOFF ERROR
The arithmetic operations among numbers in the floating-point representation are not exact. For
example, two floating-point numbers are added by first right-shifting the mantissa of the smaller one
and simultaneously increasing its exponent until the two operands have the same exponent. Low-
order bits belonging to the smaller operand are lost by this shifting. If the two operands differ too
much in magnitude, then the smaller operand is effectively replaced by zero, since it is right-shifted
to oblivion. Some of the well-known properties of the four arithmetic operations do not hold in a
calculator. For example, whereas the commutative property for addition and multiplication contin-
ues to hold, the associative property does not.
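Both failures described above can be verified directly (my own examples, assuming IEEE 754 doubles):

```python
# Floating-point addition keeps commutativity but loses associativity.
a, b, c = 1.0, 1e-16, 1e-16
left = (a + b) + c    # b is below half an ulp of 1.0, so it vanishes twice
right = a + (b + c)   # b + c = 2e-16 is large enough to survive the rounding
print(left == right)  # False
print(left, right)    # 1.0 1.0000000000000002

# When the operands differ too much in magnitude, the smaller one
# is right-shifted to oblivion and effectively replaced by zero:
print(1e20 + 1.0 - 1e20)   # 0.0
```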
Roundoff errors accumulate with increasing amounts of calculation. If we perform k arithmetic operations, and the roundoff errors come in randomly up and down, in the end we have a total roundoff error of the order of $\sqrt{k}\,(MA)$, where the square root comes from a random walk of the error. This statement is not true when the roundoff errors accumulate preferentially in one direction. In this case, the final error will be $k\,(MA)$, grown as if it were a snowball.
The finite number of digits available to represent a number restricts the numerical ranges. Some
operations can give rise to phenomena of overflow when the exponent of the output is larger than
its maximum possible value or phenomena of underflow when the exponent of the output is smaller
than its minimum possible value.
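Both phenomena are easy to trigger deliberately (an illustration of mine, assuming IEEE 754 doubles; `sys.float_info` is Python's description of the format):

```python
import math
import sys

# Overflow: doubling the largest representable double yields infinity.
big = sys.float_info.max           # about 1.8e308
print(big * 2)                     # inf
print(math.isinf(big * 2))         # True

# Underflow: halving the smallest positive double collapses to zero.
smallest = 5e-324                  # smallest positive (subnormal) double
print(smallest / 2)                # 0.0
```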
E.3 SOURCES OF ERRORS IN THE SOLUTION OF A NUMERICAL PROBLEM BY AN ALGORITHM IMPLEMENTED IN A COMPUTER
In general, a numerical problem has a certain number of input variables, $x_1, x_2, \ldots, x_n$, a certain number of output variables, $y_1, y_2, \ldots, y_m$, and some formulas that describe how the output variables depend on the input variables. For simplicity, we consider the case of one input and one output variable: $y = f(x)$.
If x is the true input value, when it is written as a machine number with the floating-point method it can be tainted by a roundoff error, becoming $\bar{x}$. The roundoff error is less than or equal to the machine accuracy MA. The uncertainty in the input propagates into the output, according to the propagation formula we learned in Appendix D:

\[
\bar{y} - y = \frac{df(x)}{dx} \left( \bar{x} - x \right) + \cdots \qquad [E.6]
\]
The relative error is:

\[
\frac{\bar{y} - y}{y} \approx \frac{df(x)}{dx} \, \frac{\bar{x} - x}{y} = \frac{x f'(x)}{f(x)} \, \frac{\bar{x} - x}{x} = \frac{x f'(x)}{f(x)} \, (MA) = c\,(MA) \qquad [E.7]
\]
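Equation [E.7] identifies $c = x f'(x)/f(x)$ as the factor by which the input's relative error is amplified (or damped) in the output. A small illustration of mine, for $f(x) = \sqrt{x}$, whose exact amplification factor is $c = 1/2$ for every $x > 0$:

```python
import math

# Amplification factor c = x * f'(x) / f(x) from equation [E.7],
# evaluated here for f(x) = sqrt(x).
def condition_number(f, dfdx, x):
    return x * dfdx(x) / f(x)

c = condition_number(math.sqrt, lambda t: 0.5 / math.sqrt(t), 2.0)
print(c)   # ≈ 0.5: the input's relative error is halved in the output
```

A function with a large c near the working point (for example, $f(x) = x - 1$ evaluated near $x = 1$) magnifies the roundoff error of the input instead of damping it; such problems are called ill-conditioned.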