Floating point types are used to store fractional (real) numbers. Before we look at the data types in this group, we first need to understand a few challenges associated with storing fractional numbers in a computer’s memory. Take the examples of the square root of 2 and 8 divided by 3:
√2 = 1.41421356237309...
8 / 3 = 2.66666666666...
Both results contain an infinite number of digits after the decimal point. We can’t store an infinite number of digits because computer memory is limited. So, what do we do?
You must have encountered this in your maths class. The general solution is to round off: keep 2, 3 or 4 digits after the decimal point and discard the rest. In doing so we lose accuracy. Keeping just 3 or 4 digits may be fine for a maths assignment, but in real-world applications accuracy is often critical. Fields such as space engineering or chip design need a very high degree of accuracy, where even a tenth of a millimetre matters.
To address this issue of accuracy, virtually all modern hardware and most programming languages, including Java, use the IEEE 754 standard for storing floating point numbers. IEEE 754 defines a way to store floating point numbers at varying precision levels. The complete details of the standard are beyond the scope of this course. We will just look at an overview of IEEE 754, which should be enough for understanding the float and double data types.
IEEE 754 Overview
In the IEEE 754 standard, the idea is to compose the fractional number of two parts:
- A significand that contains the number’s digits.
- An exponent that says where the point (a binary point, since the digits are stored in base 2) is placed relative to the significand.
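The two parts listed above can be inspected from Java itself. The following is a minimal sketch, not part of the original program: the class name SignificandExponentDemo and its helper methods are hypothetical, and it assumes a normal, non-zero value. It relies only on the standard Math.getExponent and Math.scalb methods.

```java
public class SignificandExponentDemo {

    // Power-of-two exponent of the value (assumes a normal, non-zero number).
    public static int exponentOf(double value) {
        return Math.getExponent(value);
    }

    // Significand scaled into the range [1, 2): value divided by 2^exponent.
    public static double significandOf(double value) {
        return Math.scalb(value, -Math.getExponent(value));
    }

    public static void main(String[] args) {
        double value = 6.5;
        // Prints: 6.5 = 1.625 x 2^2
        System.out.println(value + " = " + significandOf(value)
                + " x 2^" + exponentOf(value));
    }
}
```

Multiplying the significand 1.625 by 2 raised to the exponent 2 gives back 6.5, which is the sense in which the exponent "places the point".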
Single and Double Precision Formats
The two parts of the standard that we are interested in are the single and double precision formats. Single precision format uses a total of 32 bits to represent the fractional number: 1 bit for the sign, 8 bits for the exponent and 23 stored bits for the significand. Because the leading bit of the significand is implied rather than stored, the significand effectively has 24 bits of precision.
Double precision format uses a total of 64 bits to represent the fractional number: 1 bit for the sign, 11 bits for the exponent and 52 stored bits for the significand. With the implied leading bit, the significand effectively has 53 bits of precision.
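To make the single precision layout concrete, here is a small sketch that splits a float into its sign, exponent and stored significand bits. The class FloatBitsDemo and its method names are made up for illustration; Float.floatToIntBits is the standard library call that exposes the raw 32 bits.

```java
public class FloatBitsDemo {

    // Top bit: 0 for positive values, 1 for negative values.
    public static int signBit(float f) {
        return Float.floatToIntBits(f) >>> 31;
    }

    // Next 8 bits: the exponent, stored with a bias of 127.
    public static int exponentBits(float f) {
        return (Float.floatToIntBits(f) >>> 23) & 0xFF;
    }

    // Lowest 23 bits: the stored part of the significand.
    public static int fractionBits(float f) {
        return Float.floatToIntBits(f) & 0x7FFFFF;
    }

    public static void main(String[] args) {
        float f = 6.5f; // 6.5 = 1.625 x 2^2
        System.out.println("sign     = " + signBit(f));
        System.out.println("exponent = " + exponentBits(f) + " (2 + bias 127)");
        System.out.println("fraction = " + Integer.toBinaryString(fractionBits(f)));
    }
}
```

The stored exponent is 129 rather than 2 because the format adds a fixed bias of 127, which lets negative exponents be stored without a separate sign.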
The key takeaway about the single and double precision formats is this: double precision stores fractional numbers at higher accuracy than single precision because more bits are used for the significand. The extra exponent bits additionally give double precision a wider range of representable values.
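The difference in accuracy can be seen by computing 8 divided by 3 in both formats. This is only an illustrative sketch; the class PrecisionCompareDemo and its two helper methods are hypothetical names, not part of the course program.

```java
public class PrecisionCompareDemo {

    // Division carried out in single precision (32 bits).
    public static float divideAsFloat(float a, float b) {
        return a / b;
    }

    // The same division carried out in double precision (64 bits).
    public static double divideAsDouble(double a, double b) {
        return a / b;
    }

    public static void main(String[] args) {
        // The float result is printed with far fewer digits than the double result.
        System.out.println("float : " + divideAsFloat(8, 3));
        System.out.println("double: " + divideAsDouble(8, 3));
    }
}
```

Neither result is exactly 2.666..., but the double result stays correct for roughly twice as many digits before rounding shows up.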
float
The float data type stores a number in single precision format, so it has a size of 32 bits. float is useful when you need to store a fractional number with around 6-7 total digits of precision.
Let’s look at a BlueJ program to see float in action.
public class FloatDatatypeDemo
{
    public void demoFloat() {
        // 7 digits in total: within float's precision
        float f1 = 148.7623F;
        System.out.println("Value of f1 is " + f1);
        // 10 digits in total: more than float can hold
        float f2 = 148.7623549f;
        System.out.println("Value of f2 is " + f2);
    }
}
Here is the output of this program:
Value of f1 is 148.7623
Value of f2 is 148.76236
As you can see in the output, we are able to store 148.7623 in f1 without any visible loss of precision. 148.7623 has 7 digits in total, so it is within the 6-7 digits of precision offered by the single precision format that float uses to store its values.
We lost precision while storing 148.7623549 in f2. 148.7623549 has 10 digits in total, which exceeds the number of digits that float can represent accurately. So, Java rounded 148.7623549 to the nearest value that float can represent, and f2 is printed as 148.76236.
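To see why the extra digits could not be kept, it helps to look at the exact value held in the float and at the gap to the next representable float. The sketch below is an illustration only; the class FloatSpacingDemo and its methods are hypothetical, while BigDecimal and Math.nextUp are standard library features.

```java
import java.math.BigDecimal;

public class FloatSpacingDemo {

    // Exact decimal expansion of the binary value held in the float.
    public static String exactValue(float f) {
        return new BigDecimal(f).toString();
    }

    // Distance from f to the next larger value that float can represent.
    public static float gapAbove(float f) {
        return Math.nextUp(f) - f;
    }

    public static void main(String[] args) {
        float f2 = 148.7623549f;
        // Both literals round to the same float, so this prints true.
        System.out.println(f2 == 148.76236f);
        System.out.println("Exact stored value: " + exactValue(f2));
        System.out.println("Gap to next float : " + gapAbove(f2));
    }
}
```

Near 148, neighbouring float values are about 0.000015 apart, so any digits finer than that spacing are lost when the literal is rounded to the nearest representable value.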
double
The double data type stores a number in double precision format. It has a size of 64 bits. It can store a fractional number with 15-16 total digits of precision.
Building upon our previous program where we stored 148.7623549 in float and lost precision, this time let's try to store it in a double.
public class FloatDatatypeDemo
{
    public void demoFloat() {
        float f1 = 148.7623F;
        System.out.println("Value of f1 is " + f1);
        // Same 10-digit value as before, now stored in a double
        double f2 = 148.7623549;
        System.out.println("Value of f2 is " + f2);
    }
}
The output confirms that we can store 148.7623549 in double without losing precision.
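double has a limit too, just a much higher one. The following sketch converts decimal text to a double and back, once with the 10-digit value from the program and once with a 19-digit value chosen purely for illustration. The class DoubleLimitDemo and the method roundTrip are hypothetical names.

```java
public class DoubleLimitDemo {

    // Parses decimal text into a double, then converts the double back to text.
    public static String roundTrip(String decimalText) {
        return String.valueOf(Double.parseDouble(decimalText));
    }

    public static void main(String[] args) {
        // 10 digits: well within the 15-16 digits a double can hold.
        System.out.println(roundTrip("148.7623549"));
        // 19 digits: more than a double can hold, so the tail is rounded away.
        System.out.println(roundTrip("0.1234567890123456789"));
    }
}
```

The first value comes back unchanged, while the second comes back with fewer digits than were supplied.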