Glossary

Floating-point format

float16 (Half-precision floating-point): (1 sign, 5 exponent, 10 mantissa) bits

bfloat16 (Brain floating-point 16): (1 sign, 8 exponent, 7 mantissa) bits
(developed by Google Brain for deep learning; it keeps float32's 8-bit exponent, so it covers the same dynamic range at reduced precision, as the sketch below shows)
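
Because bfloat16 is simply float32 with the low 16 mantissa bits dropped, the conversion can be sketched with plain bit masking. A minimal sketch using only the standard library (it truncates rather than rounds, and the helper name `to_bfloat16` is illustrative, not from any library):

```python
import struct

def to_bfloat16(x: float) -> float:
    """Emulate bfloat16 by zeroing the low 16 bits of a float32."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]  # raw float32 bit pattern
    return struct.unpack(">f", struct.pack(">I", bits & 0xFFFF0000))[0]

print(to_bfloat16(3.141592653589793))  # 3.140625 -- only ~2-3 decimal digits survive
print(to_bfloat16(1e38))               # ~9.97e37 -- in range; float16 overflows past ~6.5e4
```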

float32 (Single-precision floating-point): (1 sign, 8 exponent, 23 mantissa) bits

float64 (Double-precision floating-point): (1 sign, 11 exponent, 52 mantissa) bits

float128 (Quadruple-precision floating-point): (1 sign, 15 exponent, 112 mantissa) bits

float256 (Octuple-precision floating-point): (1 sign, 19 exponent, 236 mantissa) bits
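
All of these formats share the same sign/exponent/mantissa layout, differing only in field widths. A minimal sketch of extracting the three fields from a float64, matching the (1 sign, 11 exponent, 52 mantissa) split above (standard library only; `split_float64` is an illustrative name, not a library function):

```python
import struct

def split_float64(x: float) -> tuple[int, int, int]:
    """Return the (sign, exponent, mantissa) bit fields of a float64."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]  # raw 64-bit pattern
    sign = bits >> 63                  # 1 bit
    exponent = (bits >> 52) & 0x7FF    # 11 bits, biased by 1023
    mantissa = bits & ((1 << 52) - 1)  # 52 bits, implicit leading 1
    return sign, exponent, mantissa

# -1.5 = (-1)^1 * 1.5 * 2^0: sign 1, biased exponent 1023, mantissa 0.5 * 2^52
print(split_float64(-1.5))  # (1, 1023, 2251799813685248)
```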