New in version 0.12.
IEEE754 is a standard for the representation of and computations with floating point numbers in binary systems. It is widely used by floating point implementations in CPUs. These functions implement encoding and decoding binary representations of floating point numbers according to IEEE754.
An IEEE754 binary float consists of three parts: a sign bit, the exponent, and the significand (sometimes called the mantissa). From these parts, the value is calculated using the following formula:
(-1) ^ sign * 2 ^ (exponent - bias) * 1.significand
The standard defines multiple binary formats of different sizes that all follow these rules but differ in the number of bits allocated for the exponent and significand. The bias for the default formats is defined as:
bias = 2 ^ (exponent_bits - 1) - 1
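As an illustration, the value formula can be evaluated by hand in a few lines of Python. This is a minimal sketch, not the implementation behind the functions documented here: decode_normal is a hypothetical helper, the sign term is read as (-1) ^ sign and the exponent term as exponent - bias, and only normal numbers are covered (zero, subnormals, infinities and NaN use reserved exponent values that the formula does not describe).

```python
def decode_normal(bits: int, exponent_bits: int, significand_bits: int) -> float:
    """Evaluate (-1)^sign * 2^(exponent - bias) * 1.significand for a normal number.

    Hypothetical helper for illustration; `bits` is the raw bit pattern as an integer.
    """
    bias = (1 << (exponent_bits - 1)) - 1
    # Field layout, most significant first: sign | exponent | significand
    sign = (bits >> (exponent_bits + significand_bits)) & 1
    exponent = (bits >> significand_bits) & ((1 << exponent_bits) - 1)
    fraction = bits & ((1 << significand_bits) - 1)
    # "1.significand": an implicit leading 1 followed by the stored fraction bits
    significand = 1 + fraction / (1 << significand_bits)
    return (-1) ** sign * 2.0 ** (exponent - bias) * significand


print(decode_normal(0x3C00, 5, 10))               # binary16 pattern, prints 1.0
print(decode_normal(0xC000, 5, 10))               # binary16 pattern, prints -2.0
print(decode_normal(0x3FF8000000000000, 11, 52))  # binary64 pattern, prints 1.5
```

For 0x3C00 the fields are sign = 0, exponent = 15 and an all-zero significand, so with a bias of 15 the result is 1 * 2 ^ 0 * 1.0 = 1.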
See this article for a more detailed introduction to the subject.
The following binary float formats are defined by the standard:
Name       Also known as     Exponent bits  Significand bits
binary16   Half precision    5              10
binary32   Single precision  8              23
binary64   Double precision  11             52
binary128  Quad precision    15             112
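The table can be sanity-checked with a short Python sketch: one sign bit plus the exponent and significand bits should add up to the width in each format's name, and the default bias follows from the exponent width alone. FORMATS and default_bias are names chosen for this example only.

```python
# (exponent bits, significand bits) per format, copied from the table
FORMATS = {
    "binary16": (5, 10),
    "binary32": (8, 23),
    "binary64": (11, 52),
    "binary128": (15, 112),
}


def default_bias(exponent_bits: int) -> int:
    """Default bias: 2^(exponent_bits - 1) - 1."""
    return 2 ** (exponent_bits - 1) - 1


for name, (exponent_bits, significand_bits) in FORMATS.items():
    total_bits = 1 + exponent_bits + significand_bits  # sign + exponent + significand
    print(f"{name}: {total_bits} bits, bias {default_bias(exponent_bits)}")
# binary16: 16 bits, bias 15
# binary32: 32 bits, bias 127
# binary64: 64 bits, bias 1023
# binary128: 128 bits, bias 16383
```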
In many programming languages, the binary32 format is available as float and binary64 is available as double.
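Python is one such language, and its struct module can expose the underlying bit patterns, which gives reference values to compare against. The sketch below assumes the usual case where a Python float is a binary64 value; binary32_bits and binary64_bits are hypothetical helper names.

```python
import struct


def binary32_bits(x: float) -> int:
    """Bit pattern of x after rounding to binary32 (struct format 'f')."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def binary64_bits(x: float) -> int:
    """Bit pattern of x as binary64 (struct format 'd')."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


print(hex(binary32_bits(1.0)))  # 0x3f800000
print(hex(binary64_bits(1.0)))  # 0x3ff0000000000000
print(hex(binary32_bits(0.1)))  # 0x3dcccccd
```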
ieee754_encode(x; exponent_bits; significand_bits[; exponent_bias])
Encode a floating point number into an IEEE754 binary representation.
Parameters: 


ieee754_decode(x; exponent_bits; significand_bits[; exponent_bias])
Calculate the value of an IEEE754 binary float.
Parameters: 


ieee754_half_encode(x)
Encode x in the half-precision binary format.
ieee754_half_decode(x)
Decode the half-precision binary float x.
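For comparison, a half-precision round trip can be sketched in Python with the struct format 'e', which is binary16. The helpers half_encode and half_decode are hypothetical and only mirror the idea of the two functions above. That the documented functions use the same integer representation of the bit pattern is an assumption.

```python
import struct


def half_encode(x: float) -> int:
    """Round x to binary16 and return the 16-bit pattern as an integer."""
    return struct.unpack(">H", struct.pack(">e", x))[0]


def half_decode(bits: int) -> float:
    """Interpret a 16-bit pattern as a binary16 value."""
    return struct.unpack(">e", struct.pack(">H", bits))[0]


print(hex(half_encode(1.0)))          # 0x3c00
print(half_decode(0xC000))            # -2.0
print(half_decode(half_encode(0.1)))  # close to 0.1, but not equal to it
```

With only 10 significand bits, most decimal fractions change slightly when encoded and decoded again, which the last line illustrates.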
ieee754_single_encode(x)
Encode x in the single-precision binary format.
ieee754_single_decode(x)
Decode the single-precision binary float x.
ieee754_double_encode(x)
Encode x in the double-precision binary format.
ieee754_double_decode(x)
Decode the double-precision binary float x.
ieee754_quad_encode(x)
Encode x in the quad-precision binary format.
ieee754_quad_decode(x)
Decode the quad-precision binary float x.
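Quad precision rarely has native language support, so a generic routine parameterized by exponent and significand width is one way to produce such bit patterns. The following Python sketch is a hypothetical encode helper, not the algorithm used by the functions documented here. It assumes round-half-to-even, takes its default bias from the rule 2 ^ (exponent_bits - 1) - 1, and handles only zero and values in the normal range of the target format.

```python
import math
from fractions import Fraction
from typing import Optional


def encode(x: float, exponent_bits: int, significand_bits: int,
           exponent_bias: Optional[int] = None) -> int:
    """Encode x as sign | exponent | significand (hypothetical sketch).

    Raises ValueError for results outside the normal range of the format.
    """
    if exponent_bias is None:
        exponent_bias = (1 << (exponent_bits - 1)) - 1
    sign = 1 if math.copysign(1.0, x) < 0 else 0
    sign_field = sign << (exponent_bits + significand_bits)
    value = abs(Fraction(x))  # exact value of the Python float
    if value == 0:
        return sign_field
    # Find e such that 2^e <= value < 2^(e + 1).
    e = value.numerator.bit_length() - value.denominator.bit_length()
    if Fraction(2) ** e > value:
        e -= 1
    # Scale so the implicit leading 1 lands on bit `significand_bits`;
    # round() on a Fraction rounds half to even and returns an int.
    mantissa = round(value / Fraction(2) ** e * (1 << significand_bits))
    if mantissa == 1 << (significand_bits + 1):
        # Rounding carried into the next power of two.
        mantissa >>= 1
        e += 1
    biased = e + exponent_bias
    if not 0 < biased < (1 << exponent_bits) - 1:
        raise ValueError("value is outside the normal range of this format")
    return sign_field | (biased << significand_bits) | (mantissa - (1 << significand_bits))


print(hex(encode(1.0, 15, 112)))  # quad: 0x3fff0000000000000000000000000000
print(hex(encode(0.1, 8, 23)))    # single: 0x3dcccccd
print(hex(encode(-2.0, 5, 10)))   # half: 0xc000
```

Exact rational arithmetic keeps the rounding step simple and avoids depending on the precision of the host's floating point type, at the cost of speed.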