Untangling Complex Systems, page 102
\[
\frac{\partial \chi^2}{\partial B} = -2 \sum_{i=1}^{n} w_i x_i \left( y_i - A - B x_i \right) = 0
\]
From the equations [D.19], wherein $w_i = 1/\sigma_{y_i}^2$, we obtain the least-squares estimates of the coefficients A and B:
\[
A = \frac{\sum_{i=1}^{n} w_i x_i^2 \sum_{i=1}^{n} w_i y_i - \sum_{i=1}^{n} w_i x_i \sum_{i=1}^{n} w_i x_i y_i}{\sum_{i=1}^{n} w_i \sum_{i=1}^{n} w_i x_i^2 - \left( \sum_{i=1}^{n} w_i x_i \right)^2} \qquad [D.20]
\]

\[
B = \frac{\sum_{i=1}^{n} w_i \sum_{i=1}^{n} w_i x_i y_i - \sum_{i=1}^{n} w_i x_i \sum_{i=1}^{n} w_i y_i}{\sum_{i=1}^{n} w_i \sum_{i=1}^{n} w_i x_i^2 - \left( \sum_{i=1}^{n} w_i x_i \right)^2} \qquad [D.21]
\]
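Equations [D.20] and [D.21] translate almost literally into code. The sketch below is an illustration, not text from the book: the function name and the test data are my own, and each point is assumed to carry an uncertainty σ_i from which the weight w_i = 1/σ_i² is formed.

```python
# Weighted least-squares estimates of the intercept A and slope B
# for the model y = A + B*x, following equations [D.20] and [D.21].
def weighted_fit(x, y, sigma_y):
    w = [1.0 / s ** 2 for s in sigma_y]        # weights w_i = 1/sigma_i^2
    Sw = sum(w)
    Swx = sum(wi * xi for wi, xi in zip(w, x))
    Swy = sum(wi * yi for wi, yi in zip(w, y))
    Swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    Swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    delta = Sw * Swxx - Swx ** 2               # common denominator
    A = (Swxx * Swy - Swx * Swxy) / delta      # equation [D.20]
    B = (Sw * Swxy - Swx * Swy) / delta        # equation [D.21]
    return A, B

# Points lying exactly on y = 1 + 2x, all with uncertainty 0.5:
A, B = weighted_fit([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0], [0.5] * 4)
print(A, B)   # 1.0 2.0
```

Because the data here lie exactly on a line, the fit recovers the intercept and slope exactly; with scattered data the same formulas return the weighted best estimates.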
When all the uncertainties in the $y_i$ values are equal, the formulas for the determination of A and B are like equations [D.20] and [D.21] but with all the weights $w_i = 1$. It may happen that we collect each pair of points $(x_i, y_i)$ just once, and we estimate only their systematic uncertainty due to the equipment. In such cases, we may still use equations [D.20] and [D.21] to evaluate the A and B coefficients. Moreover, we may also determine a random-type uncertainty $\sigma_y$ in the data $(y_1, y_2, \ldots, y_n)$ by the following formula:
\[
\sigma_y^2 = \frac{1}{n-2} \sum_{i=1}^{n} \left( y_i - A - B x_i \right)^2 \qquad [D.22]
\]
4 The procedure of minimizing the sum of squares appearing in the exponent, χ², gives the method its name: least-squares fitting.
Equation [D.22] is based on the assumption that each $y_i$ is normally distributed about its true value $A + Bx_i$ with width parameter $\sigma_y$.5 After having found the uncertainty $\sigma_y$ in the measured quantities $(y_1, y_2, \ldots, y_n)$, we can calculate the uncertainties in A and B by the error propagation formula. The result is:
\[
\sigma_A = \sigma_y \sqrt{\frac{\sum_{i=1}^{n} x_i^2}{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2}} \qquad [D.23]
\]
\[
\sigma_B = \sigma_y \sqrt{\frac{n}{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2}} \qquad [D.24]
\]
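The equal-uncertainty case, equations [D.20]–[D.24] with all $w_i = 1$, can be sketched as follows. The function name and the sample data are illustrative, not from the text:

```python
import math

# Unweighted least-squares fit plus the uncertainties of
# equations [D.22]-[D.24].
def fit_with_uncertainties(x, y):
    n = len(x)
    Sx, Sy = sum(x), sum(y)
    Sxx = sum(xi * xi for xi in x)
    Sxy = sum(xi * yi for xi, yi in zip(x, y))
    delta = n * Sxx - Sx ** 2
    A = (Sxx * Sy - Sx * Sxy) / delta        # [D.20] with all w_i = 1
    B = (n * Sxy - Sx * Sy) / delta          # [D.21] with all w_i = 1
    # Random-type uncertainty of the data, equation [D.22]:
    sigma_y = math.sqrt(sum((yi - A - B * xi) ** 2
                            for xi, yi in zip(x, y)) / (n - 2))
    sigma_A = sigma_y * math.sqrt(Sxx / delta)   # [D.23]
    sigma_B = sigma_y * math.sqrt(n / delta)     # [D.24]
    return A, B, sigma_y, sigma_A, sigma_B

# Four points scattered slightly about a line:
A, B, s_y, s_A, s_B = fit_with_uncertainties(
    [0.0, 1.0, 2.0, 3.0], [0.1, 0.9, 2.1, 2.9])
print(A, B)        # A ≈ 0.06, B ≈ 0.96
print(s_y, s_A, s_B)
```

Note the $(n-2)$ in the computation of $\sigma_y$: with only two points the residuals vanish identically and no scatter can be estimated, exactly as footnote 5 explains.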
The extent to which a set of points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ supports a linear relation between the variables x and y is measured by the correlation coefficient r:

\[
r = \frac{\sum_{i=1}^{n} \left( x_i - \bar{x} \right)\left( y_i - \bar{y} \right)}{\left[ \sum_{i=1}^{n} \left( x_i - \bar{x} \right)^2 \sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2 \right]^{1/2}} = \frac{\sigma_{xy}}{\sigma_x \sigma_y} \qquad [D.25]
\]
The possible values of the correlation coefficient are in the range $-1 \le r \le +1$. If the points are correlated, $\sigma_{xy} \approx \sigma_x \sigma_y$ and $r \approx 1$. On the other hand, if the points are uncorrelated, then $\sigma_{xy} \approx 0$ and $r \approx 0$.
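Equation [D.25] is a one-line computation. A minimal sketch (function name mine), exercised on a perfectly correlated and a perfectly anti-correlated data set:

```python
import math

# Correlation coefficient r of equation [D.25].
def correlation(x, y):
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    num = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    den = math.sqrt(sum((xi - xbar) ** 2 for xi in x)
                    * sum((yi - ybar) ** 2 for yi in y))
    return num / den

print(correlation([1, 2, 3, 4], [2, 4, 6, 8]))   # 1.0  (y grows with x)
print(correlation([1, 2, 3, 4], [8, 6, 4, 2]))   # -1.0 (y falls as x grows)
```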
D.8 TEST χ²
When we collect data about the value of a variable or a parameter, and we assume to describe them through a specific function or distribution, we might need a parameter that quantifies the agreement between the observed and the expected values. Such a parameter exists, and it is the χ²:
\[
\chi^2 = \sum_{i=1}^{n} \left( \frac{\text{observed value}_i - \text{expected value}_i}{\text{standard deviation}_i} \right)^2 \qquad [D.26]
\]
If the agreement is good, χ² will be of the order of the number of degrees of freedom (d), i.e., the number n of the collected data minus the number of the parameters calculated through the data. If the agreement is poor, χ² will be much larger than d. A slightly more convenient way to think about this test is to introduce the reduced chi-squared ($\tilde{\chi}^2$), defined as:
\[
\tilde{\chi}^2 = \frac{\chi^2}{d} \qquad [D.27]
\]
5 The factor (n − 2) appearing in equation [D.22] is analogous to the factor (n − 1) of equation [D.4]. Both factors represent the degrees of freedom, which is the number of independent measurements minus the number of parameters calculated from those measurements. It is not difficult to remember the factors (n − 1) and (n − 2) because they are reasonable. In fact, if we collect just one datum regarding the value of a variable, we cannot calculate the standard deviation. Similarly, if we collect only two pairs of data, $(x_1, y_1)$ and $(x_2, y_2)$, we can always find a line that passes exactly through both points, and the least-squares method gives this line. With just two pairs of data, we cannot deduce anything about the reliability of the straight line.
Whatever the number of degrees of freedom, if we obtain a $\tilde{\chi}^2$ of the order of one or less, we have no reason to doubt our expected distribution or function. On the other hand, if we obtain a $\tilde{\chi}^2$ much larger than one, our expected distribution or function is unlikely to be correct.
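Equations [D.26] and [D.27] can be put together in a few lines. The function name and the numbers below are my own illustration: five measurements compared with a model predicting 10.0 for each, all with standard deviation 0.5, and two fitted parameters:

```python
# Chi-squared and reduced chi-squared, equations [D.26] and [D.27].
def reduced_chi_squared(observed, expected, sigma, n_params):
    chi2 = sum(((o - e) / s) ** 2
               for o, e, s in zip(observed, expected, sigma))
    d = len(observed) - n_params        # degrees of freedom
    return chi2 / d

obs = [10.2, 9.7, 10.1, 9.9, 10.4]
exp = [10.0] * 5
print(reduced_chi_squared(obs, exp, [0.5] * 5, n_params=2))
# ≈ 0.41: of order one, so no reason to doubt the model
```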
D.9 CONCLUSION
Now it is clear why we say that science is exact. The rigorous treatment of the uncertainties in each measurement allows us to quantify how much we can trust our scientific assertions. It is also evident that, due to random errors, many experiments are unique and unreproducible events. This statement is especially true when we deal with chaotic systems.
D.10 HINTS FOR FURTHER READING
More information about error analysis in measurements can be found in the books An Introduction to Error Analysis by Taylor (1997), A Practical Guide to Data Analysis for Physical Science Students by Lyons (1991), and Statistical Treatment of Experimental Data by Young (1962).
Appendix E: Errors in Numerical Computation
We are successful because we use the right level of abstraction.
Avi Wigderson (1956 AD–)
Calculations, just like experimental measurements, are subject to errors. In fact, any scientist, whatever his or her expertise, introduces inevitable uncertainties not only when performing an experiment but also when computing with calculators, numerical methods, and algorithms. It is essential to have a notion of the errors that can be made in computations.
E.1 ROUNDOFF ERRORS
Whatever is used as our computing machine, it can only represent numerical amounts using a limited number of significant figures. Thus, irrational numbers such as π, Napier's constant e, and $\sqrt{2}$ cannot be represented exactly. The discrepancy between the real number and its representation in the computer is called roundoff error. Any real number N can be written with the floating-point method as
\[
N = p\,b^q \qquad [E.1]
\]
In [E.1], p is the mantissa, b is the base of the numeration system, and q is called the exponent or the characteristic.1 In a calculator, the space available for the representation of any real number in binary digits is organized as shown in Figure E.1, wherein s represents the sign of the number.
In most of the modern binary calculators ($b = 2$), a number N is represented with one of two different precisions, called single and double precision, respectively. In single precision,
32 bits are available to represent N: 23 bits are used to define the significant digits of p, 8 bits are used to store the exponent q, and one bit is used to store the sign s ("0" for positive numbers and "1"
for negative numbers). In double precision, 64 bits are available, with 52 digits used to represent the
significant figures. For instance, the number π is represented as 3.141593 in single precision, and
as 3.141592653589793 in double precision. Evidently, the use of double precision reduces
the effects of rounding error. However, it increases the run-time of the computation. There are two
techniques of rounding. Let us assume that the maximum number of digits in the mantissa is n.
The distance between two consecutive mantissas, $p_1$ and $p_2$, is equal to $b^{-n}$: $p_2 - p_1 = b^{-n}$. We may round a mantissa p, included between $p_1$ and $p_2$ (see the left part of Figure E.2), by chopping all the digits that are on the right of the n-th digit. Any mantissa p of Figure E.2 is then substituted by $p_1$. The roundoff error is
\[
\left| \bar{p} - p \right| < b^{-n} \qquad [E.2]
\]
An alternative technique of rounding consists in adding $\frac{1}{2}b^{-n}$ to the mantissa p and then chopping the result at the n-th digit. All the mantissas included in the range $\left( p_1,\; p_1 + \frac{1}{2}b^{-n} \right)$ are substituted by $p_1$,
1 The representation of the number $N = p\,b^q$ is defined as normalized when $b^{-1} \le p < 1$. For instance, the representation of $a_1 = 12.73$ is normalized when $a_1 = 0.1273 \times 10^2$, whereas that of $a_2 = 0.00098$ is normalized when $a_2 = 0.98 \times 10^{-3}$.
FIGURE E.1 Representation of a real number in a calculator: s is the sign, q is the exponent, and p is the mantissa.
FIGURE E.2 On the left, a true mantissa p is included between the two consecutive mantissas, $p_1$ and $p_2$, rounded by the calculator. On the right, a true real number N is included between two consecutive numbers, $N_1$ and $N_2$, rounded by the calculator.
whereas all the mantissas included in the range $\left( p_1 + \frac{1}{2}b^{-n},\; p_2 \right)$ are substituted by $p_2$. Therefore, if we indicate with $\bar{p}$ the rounded mantissa, the roundoff error is

\[
\left| \bar{p} - p \right| \le \frac{1}{2}\, b^{-n} \qquad [E.3]
\]
If $N = p\,b^q$ is a real number whose rounded version is $\bar{N} = \bar{p}\,b^q$ (see the right part of Figure E.2), the absolute errors in N according to equations [E.2] and [E.3] are:

\[
\left| \bar{N} - N \right| < b^{q-n} \ \text{(as in [E.2])}, \qquad \left| \bar{N} - N \right| \le \frac{1}{2}\, b^{q-n} \ \text{(as in [E.3])} \qquad [E.4]
\]
whereas the relative errors are:

\[
\frac{\left| \bar{N} - N \right|}{\left| N \right|} \le b^{1-n} \ \text{(as in [E.2])}, \qquad \frac{\left| \bar{N} - N \right|}{\left| N \right|} \le \frac{1}{2}\, b^{1-n} \ \text{(as in [E.3])} \qquad [E.5]
\]
We define the relative error [E.5] as the accuracy of the calculator (or machine accuracy, MA). It is equal to either $MA = b^{1-n}$ or $MA = \frac{1}{2}b^{1-n}$, depending on whether the first or the second rounding methodology is used in the calculator.
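The spacing $b^{1-n}$ between 1.0 and the next representable number can be probed empirically. The sketch below assumes IEEE 754 double precision ($b = 2$, $n = 53$); with round-to-nearest, the machine accuracy bound of [E.5] is half the printed spacing:

```python
import sys

# Halve eps until adding it to 1.0 no longer changes the stored result;
# eps then equals the spacing of floating-point numbers just above 1.0.
eps = 1.0
while 1.0 + eps / 2 != 1.0:
    eps /= 2
print(eps)                             # 2.220446049250313e-16, i.e. 2**-52
print(eps == sys.float_info.epsilon)   # True: Python reports the same value
```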
E.2 PROPAGATION OF THE ROUNDOFF ERROR
The arithmetic operations among numbers in the floating-point representation are not exact. For
example, two floating-point numbers are added by first right-shifting the mantissa of the smaller one
and simultaneously increasing its exponent until the two operands have the same exponent. Low-
order bits belonging to the smaller operand are lost by this shifting. If the two operands differ too
much in magnitude, then the smaller operand is effectively replaced by zero, since it is right-shifted
to oblivion. Some of the well-known properties of the four arithmetic operations do not hold in a
calculator. For example, whereas the commutative property for addition and multiplication contin-
ues to hold, the associative property does not.
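Both failures described above can be verified directly (my own examples, assuming IEEE 754 doubles):

```python
# Floating-point addition keeps commutativity but loses associativity.
a, b, c = 1.0, 1e-16, 1e-16
left = (a + b) + c    # b is below half an ulp of 1.0, so it vanishes twice
right = a + (b + c)   # b + c = 2e-16 is large enough to survive the rounding
print(left == right)  # False
print(left, right)    # 1.0 1.0000000000000002

# When the operands differ too much in magnitude, the smaller one
# is right-shifted to oblivion and effectively replaced by zero:
print(1e20 + 1.0 - 1e20)   # 0.0
```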
Roundoff errors accumulate with increasing amounts of calculation. If we perform k arithmetic operations, and the roundoff errors come in randomly up and down, in the end we have a total roundoff error of the order of $\sqrt{k}\,(MA)$, where the square root comes from a random walk of the error. This statement is not true when the roundoff errors accumulate preferentially in one direction. In this case, the final error will be $k\,(MA)$, grown as if it were a snowball.
The finite number of digits available to represent a number restricts the numerical ranges. Some
operations can give rise to phenomena of overflow when the exponent of the output is larger than
its maximum possible value or phenomena of underflow when the exponent of the output is smaller
than its minimum possible value.
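Both phenomena are easy to trigger deliberately (an illustration of mine, assuming IEEE 754 doubles; `sys.float_info` is Python's description of the format):

```python
import math
import sys

# Overflow: doubling the largest representable double yields infinity.
big = sys.float_info.max           # about 1.8e308
print(big * 2)                     # inf
print(math.isinf(big * 2))         # True

# Underflow: halving the smallest positive double collapses to zero.
smallest = 5e-324                  # smallest positive (subnormal) double
print(smallest / 2)                # 0.0
```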
E.3 SOURCES OF ERRORS IN THE SOLUTION OF A NUMERICAL PROBLEM BY AN ALGORITHM IMPLEMENTED IN A COMPUTER
In general, a numerical problem has a certain number of input variables, $x_1, x_2, \ldots, x_n$, a certain number of output variables, $y_1, y_2, \ldots, y_m$, and some formulas that describe how the output variables depend on the input variables. For simplicity, we consider the case of one input and one output variable: $y = f(x)$.
If x is the true input value, when it is written as a machine number with the floating-point method it can be tainted by a roundoff error, becoming $\bar{x}$. The roundoff error is less than or equal to the machine accuracy MA. The uncertainty in the input propagates into the output, according to the propagation formula we learned in Appendix D:

\[
\bar{y} - y = \frac{df(x)}{dx} \left( \bar{x} - x \right) + \cdots \qquad [E.6]
\]
The relative error is:

\[
\frac{\bar{y} - y}{y} \approx \frac{df(x)}{dx} \, \frac{\bar{x} - x}{y} = \frac{x f'(x)}{f(x)} \, \frac{\bar{x} - x}{x} = \frac{x f'(x)}{f(x)} \, (MA) = c\,(MA) \qquad [E.7]
\]
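Equation [E.7] identifies $c = x f'(x)/f(x)$ as the factor by which the input's relative error is amplified (or damped) in the output. A small illustration of mine, for $f(x) = \sqrt{x}$, whose exact amplification factor is $c = 1/2$ for every $x > 0$:

```python
import math

# Amplification factor c = x * f'(x) / f(x) from equation [E.7],
# evaluated here for f(x) = sqrt(x).
def condition_number(f, dfdx, x):
    return x * dfdx(x) / f(x)

c = condition_number(math.sqrt, lambda t: 0.5 / math.sqrt(t), 2.0)
print(c)   # ≈ 0.5: the input's relative error is halved in the output
```

A function with a large c near the working point (for example, $f(x) = x - 1$ evaluated near $x = 1$) magnifies the roundoff error of the input instead of damping it; such problems are called ill-conditioned.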