Bits and Bytes: Integers

As mentioned, the difference between the #\color{#4271ae} {\mathtt{\text{int}}}# and #\color{#4271ae} {\mathtt{\text{float}}}# types is how they are stored in memory. That is not entirely correct, however: the key difference is how they are read from memory. At the lowest level, all information stored on your computer consists of zeroes and ones, or bits, a binary datatype much like booleans. Thus, at the lowest level all data is represented in the same way. How this binary data is interpreted determines what it represents at the highest level: what appears on your screen.
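To make this concrete, here is a minimal sketch using Python's standard struct module. The variable names are illustrative assumptions, not part of the course material. It stores the number 1.0 as a 32-bit float and then reads the very same four bytes back as a 32-bit unsigned integer.

```python
import struct

# Store 1.0 as a 32-bit float (little-endian): four bytes of raw binary data.
raw = struct.pack("<f", 1.0)

# Read the same four bytes back in two different ways.
as_float = struct.unpack("<f", raw)[0]  # interpreted as a float
as_int = struct.unpack("<I", raw)[0]    # interpreted as an unsigned integer

print(as_float)          # 1.0
print(as_int)            # 1065353216
print(f"{as_int:032b}")  # the bit pattern shared by both interpretations
```

The bytes never change between the two reads; only the interpretation does.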

There are two distinct integer formats: signed and unsigned integers. All integers are stored in computer memory as binary arrays of a certain length. The length of the array is determined by the specific integer type and your operating system. Signed integers give the left-most bit a negative value; unsigned integers do not and can therefore only represent non-negative numbers.
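As a quick illustration, the built-in int.from_bytes method can read one and the same byte under both interpretations. This is only a sketch; the variable names are chosen here for illustration.

```python
# One byte with every bit set: 11111111.
raw = bytes([0b11111111])

# The signed flag selects how the left-most bit is interpreted.
unsigned_value = int.from_bytes(raw, byteorder="big", signed=False)
signed_value = int.from_bytes(raw, byteorder="big", signed=True)

print(unsigned_value)  # 255
print(signed_value)    # -1
```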

Unsigned

A binary array is converted to an unsigned integer with the following formula:
\[\sum^{n-1}_{i=0} \textbf{bit}_i \times 2^{i}\]
Here, #n# is the length of the binary array and #\textbf{bit}_i# is the value of the bit at the #i#th position, counting from the right and starting at #\textbf{bit}_0#.

An example of an 8-bit binary array or byte is:

#\text{2}^\text{7}#	#\text{2}^\text{6}#	#\text{2}^\text{5}#	#\text{2}^\text{4}#	#\text{2}^\text{3}#	#\text{2}^\text{2}#	#\text{2}^\text{1}#	#\text{2}^\text{0}#
#\mathtt{0}#	#\mathtt{1}#	#\mathtt{0}#	#\mathtt{1}#	#\mathtt{0}#	#\mathtt{1}#	#\mathtt{0}#	#\mathtt{1}#
0	64	0	16	0	4	0	1

This binary array represents the unsigned integer: #\text{64} + \text{16} + \text{4} + \text{1} = \text{85}#.

#\mathtt{01010101}#

Step 1: Find length of binary array.

#n = 8#

Step 2: Plug the values in the formula.

\[\begin{array}{rcl}
\sum^{n-1}_{i=0} \textbf{bit}_i \times 2^{i} & = & \sum^{7}_{i=0} \textbf{bit}_i \times 2^{i}\\
& = & 1 \times 2^0 + 0 + 1 \times 2^2 + 0\\
& & + \ 1 \times 2^4 + 0 + 1 \times 2^6 + 0\\
& = & 1 + 4+ 16 + 64\\
& = & 85
\end{array}\]
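The unsigned formula can also be sketched in Python. The function name unsigned_from_bits and the choice to pass the binary array as a string of 0 and 1 characters are assumptions made for this illustration.

```python
def unsigned_from_bits(bits):
    """Interpret a string of '0'/'1' characters as an unsigned integer."""
    n = len(bits)
    total = 0
    for i in range(n):
        # bit_0 is the right-most character, so index from the end.
        bit_i = int(bits[n - 1 - i])
        total += bit_i * 2**i
    return total


print(unsigned_from_bits("01010101"))  # 85
print(int("01010101", 2))              # 85, Python's built-in base-2 parser agrees
```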

Signed

A binary array is converted to a signed integer with the following formula:
\[(-2^{n-1}) \times {\textbf{bit}_{n-1}} + \sum^{n-2}_{i=0} \textbf{bit}_i \times 2^{i}\]
Again, #n# is the length of the binary array and #\textbf{bit}_i# is the value of the bit at the #i#th position, counting from the right and starting at #\textbf{bit}_0#.

The main difference from the unsigned interpretation is that the left-most bit now carries a negative weight: it contributes #-2^{n-1}# instead of #+2^{n-1}#.

An example of another 8-bit binary array:

#\text{-2}^\text{7}#	#\text{2}^\text{6}#	#\text{2}^\text{5}#	#\text{2}^\text{4}#	#\text{2}^\text{3}#	#\text{2}^\text{2}#	#\text{2}^\text{1}#	#\text{2}^\text{0}#
#\mathtt{1}#	#\mathtt{0}#	#\mathtt{1}#	#\mathtt{0}#	#\mathtt{1}#	#\mathtt{0}#	#\mathtt{1}#	#\mathtt{0}#
-128	0	32	0	8	0	2	0

This binary array represents the signed integer: #\text{-128} + \text{32} + \text{8} + \text{2} = \text{-86}#.

#\mathtt{10101010}#

Step 1: Find length of binary array.

#n = 8#

Step 2: Find the value of the sign-bit at position #n-1=7#.

#\textbf{bit}_{n-1} = \textbf{bit}_7 = 1#

Step 3: Plug the values in the formula.

#(-2^{n-1}) \times {\textbf{bit}_{n-1}} + \sum^{n-2}_{i=0} \textbf{bit}_i \times 2^{i}#
\[\begin{array}{rcl}
= & (-2^7) \times 1 + \sum^{6}_{i=0} \textbf{bit}_i \times 2^{i} \\
= & -128 + 0 + 1 \times 2^5 + 0 \ +\\
& 1 \times 2^3 + 0 + 1 \times 2^1 + 0\\
=& -128 + 32 + 8 + 2\\
=& -86
\end{array}\]
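The signed formula translates to Python in much the same way. As before, this is a sketch: the function name signed_from_bits and the string representation of the binary array are assumptions made for illustration.

```python
def signed_from_bits(bits):
    """Interpret a string of '0'/'1' characters as a signed integer."""
    n = len(bits)
    # The left-most character is the sign bit, bit_{n-1}, with weight -2^(n-1).
    sign_bit = int(bits[0])
    total = -(2 ** (n - 1)) * sign_bit
    # The remaining bits, bit_0 up to bit_{n-2}, keep their positive weights.
    for i in range(n - 1):
        total += int(bits[n - 1 - i]) * 2**i
    return total


print(signed_from_bits("10101010"))  # -86
print(signed_from_bits("01010101"))  # 85
```

When the sign bit is 0, the signed and unsigned interpretations give the same value, as the second call shows.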

The basic #\color{#4271ae} {\mathtt{\text{int}}}# type in Python is signed but scalable: its length grows to accommodate bigger numbers. You can try this yourself by entering some numbers bigger than #2^{31}# and #2^{63}#.
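A short sketch of that experiment is given below. It uses the built-in bit_length method, which reports how many bits are needed to represent the magnitude of a number; the list name big_values is an illustrative choice.

```python
big_values = [2**31, 2**63, 2**100]

for value in big_values:
    # The type stays int while the number of bits keeps growing.
    print(value, type(value).__name__, value.bit_length())
```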

Now that we know how to convert from binary to integer, we can also determine the range of numbers an integer format can describe. For unsigned integers, the maximum is found by setting all bits to #1# and the minimum by setting all bits to #0#. For signed integers, the minimum is found by setting only the sign bit to #1#, and the maximum by setting all bits except the sign bit to #1#.

For example, to find the range the #\mathtt{\text{uint8}}# (#\mathtt{\text{u}}#nsigned #\mathtt{\text{int}}#eger #\mathtt{\text{8}}#-bit) format describes:

Finding the range of the #\mathtt{\text{uint8}}# format

Step 1: Find the maximum number by converting #\mathtt{11111111}# to integer.

The maximum is #2^7 + 2^6 + 2^5 + 2^4+2^3+2^2+2^1+2^0 = 255#.

Step 2: Find the minimum.

The minimum for unsigned integers is #\mathtt{00000000} \rightarrow 0#.

Thus, the #\mathtt{\text{uint8}}# format describes all integers in the range #[0, 255]#.

You might have noticed that #255# is awfully close to #256#, or #2^8#. That is no coincidence: the range that an integer format with #n# bits can describe is more easily formalized as:

\[\begin{array}{rcl}
[-2^{n-1}, 2^{n-1}-1] & \hspace{1em} & \text{(signed)} \\
[0, 2^{n}-1] & \hspace{1em} & \text{(unsigned)}
\end{array}\]
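These two range formulas can be checked with a small helper. The function name integer_range and its parameters are assumptions made for this sketch.

```python
def integer_range(n, signed):
    """Return (minimum, maximum) for an n-bit integer format."""
    if signed:
        return -(2 ** (n - 1)), 2 ** (n - 1) - 1
    return 0, 2**n - 1


for n in (8, 16, 32):
    print(n, integer_range(n, signed=False), integer_range(n, signed=True))
```

For #n = 8# this gives #[0, 255]# for the unsigned format and #[-128, 127]# for the signed one.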