Floating-point numbers

Vocabulary

English	Chinese	Pinyin
floating-point	浮点	fú diǎn
mantissa	尾数	wěi shù
exponent	指数	zhǐ shù
fixed-point	定点	dìng diǎn
normalised	规格化	guī gé huà
precision	精度	jīng dù
rounding errors	舍入误差	shě rù wù chā

Storing decimals in binary

To store reals of very different sizes, computers use floating-point 浮点, a binary form of scientific notation.
Each number has two parts, a mantissa and an exponent, and understanding them explains some famous bugs.
Let's see the format, normalisation, and why 0.1 + 0.2 ≠ 0.3.

The format

A floating-point number has a mantissa 尾数 (the significant digits) and an exponent 指数 (the power of 2). Both are stored as two's complement:

$$\text{number} = \text{mantissa} \times 2^{\text{exponent}}$$

The mantissa is a fixed-point 定点 fraction: its binary point sits immediately after the sign bit.
Worked example: the mantissa 0.1010000 is $\tfrac12 + \tfrac18 = 0.625$; with exponent 2 (00000010), the value is $0.625 \times 2^2 = 2.5$.
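The worked example can be checked with a short Python sketch. It assumes the 8-bit mantissa and 8-bit exponent layout described in this lesson; the function names `twos_complement` and `decode` are made up for illustration.

```python
def twos_complement(bits: str) -> int:
    """Interpret a bit string as a two's complement integer."""
    value = int(bits, 2)
    if bits[0] == "1":              # sign bit set, so the value is negative
        value -= 1 << len(bits)
    return value


def decode(mantissa_bits: str, exponent_bits: str) -> float:
    """Return mantissa * 2**exponent for the lesson's format (an assumed layout)."""
    # Read the mantissa as an integer, then divide by 2**(n-1) so that the
    # binary point lands immediately after the sign bit.
    mantissa = twos_complement(mantissa_bits) / (1 << (len(mantissa_bits) - 1))
    exponent = twos_complement(exponent_bits)
    return mantissa * 2 ** exponent


print(decode("01010000", "00000010"))   # 2.5
```

The same function handles negative mantissas and negative exponents, because both fields are read as two's complement.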

The place values of an 8-bit mantissa and an 8-bit exponent

Practice

A floating-point number is stored as:

a mantissa multiplied by 2 raised to an exponent
a single integer
a string of digits
a denary fraction

Practice

Match each part of floating-point representation to its role.

the significant digits of the value

the power of 2 to scale by

remove leading zeros for maximum precision

used instead, for exact money values

Mantissa
Exponent
Normalisation
Fixed-point / BCD

Normalisation

A number is normalised 规格化 when the first significant bit sits immediately after the binary point (no wasted leading zeros). With a two's complement mantissa, a normalised positive number starts 0.1 and a normalised negative number starts 1.0.
This maximises precision 精度: every mantissa bit then carries information.
To normalise, shift the mantissa left until the leading bit is in place and decrease the exponent by 1 for each shift; the value is unchanged.
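The shifting procedure can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function name `normalise` is hypothetical, and it assumes a two's complement mantissa, so "normalised" means the sign bit and the bit after it differ (0.1 for positive values, 1.0 for negative ones).

```python
def normalise(mantissa_bits: str, exponent: int) -> tuple[str, int]:
    """Shift the mantissa left until its two leading bits differ,
    lowering the exponent by 1 per shift so the value stays the same."""
    if set(mantissa_bits) == {"0"}:
        return mantissa_bits, exponent      # zero has no significant bit to move
    while mantissa_bits[0] == mantissa_bits[1]:
        # Drop the redundant bit after the sign bit and append a 0 on the right.
        mantissa_bits = mantissa_bits[0] + mantissa_bits[2:] + "0"
        exponent -= 1
    return mantissa_bits, exponent


# 0.0010100 x 2^4 and 0.1010000 x 2^2 are both 2.5
print(normalise("00010100", 4))   # ('01010000', 2)
```

Two left shifts double the mantissa twice, and the exponent drops from 4 to 2 to compensate, so the value is still 2.5.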

Explore

Normalising a floating-point number

Step through normalisation. Shifting the mantissa to remove wasted leading zeros — and adjusting the exponent to match — keeps the value the same but spends every bit on precision.

Practice

Normalising a floating-point number:

maximises precision by removing wasted leading zeros in the mantissa
changes the value of the number
makes the exponent zero
rounds the number to an integer

Rounding errors 舍入误差

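Many denary reals can't be stored exactly in binary: 0.1 is a repeating binary fraction (0.000110011…), so it has to be cut off to fit the mantissa.
Consequences: rounding errors build up, so 0.1 + 0.2 is not exactly 0.3. Never test x = 0.3 directly; compare with a tolerance instead, for example ABS(x - 0.3) < 1e-9. Subtracting nearly equal values also loses precision, because the leading bits cancel and mostly rounding error is left.
For exact needs (currency), use fixed-point or BCD instead.
Floating-point arithmetic can also suffer overflow (the exponent is too large to store) and underflow (the value is too close to zero to store, so it rounds to zero).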
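These effects are easy to see in Python. The sketch below is illustrative only: Python's built-in `float` is a 64-bit format rather than the 8-bit layout used in this lesson, the helper name `binary_digits` is made up, and counting whole pence in an integer stands in for the fixed-point approach.

```python
from fractions import Fraction


def binary_digits(value: Fraction, count: int) -> str:
    """Return the first `count` binary digits after the point for 0 <= value < 1."""
    digits = []
    for _ in range(count):
        value *= 2                  # doubling moves the next bit in front of the point
        bit = int(value >= 1)
        digits.append(str(bit))
        value -= bit                # keep only the fractional part
    return "".join(digits)


# The pattern 0011 repeats forever, so 0.1 never fits in a finite mantissa.
print(binary_digits(Fraction(1, 10), 12))   # 000110011001

total = 0.1 + 0.2
exactly_equal = total == 0.3                # False: the stored values carry tiny errors
close_enough = abs(total - 0.3) < 1e-9      # True: compare with a tolerance instead
print(exactly_equal, close_enough)          # False True

# Fixed-point idea for currency: count whole pence in an integer.
total_pence = 10 + 20
print(total_pence == 30)                    # True
```

The tolerance of 1e-9 matches the one quoted above; in practice the right tolerance depends on the size of the values being compared.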

Explore

Build a floating-point number

Flip the mantissa and exponent bits to make a value, and check whether it is normalised.

Practice

0.1 cannot be stored exactly in binary (it is a repeating fraction), so 0.1 + 0.2 does not give exactly 0.3 on a computer.

Practice

For exact money calculations you should use:

fixed-point or BCD instead of floating-point
a larger floating-point mantissa
a faster CPU
more RAM

Sliding the binary point so the first significant bit sits immediately after it (the form 0.1xxx for a positive number, no wasted leading digits) normalises the mantissa; the exponent is then adjusted by the number of places shifted so that the value stays the same.

You've got it

Key idea

floating-point: $\text{value} = \text{mantissa} \times 2^{\text{exponent}}$ (both two's complement)
mantissa = significant digits; exponent = power of 2
normalisation puts the first significant bit just after the point → maximum precision
reals can't always be stored exactly → rounding errors; compare with a tolerance; use fixed-point/BCD for currency
