Lesson 2. Advanced Binary Arithmetic
Lesson Objective
 Know the difference between unsigned and signed binary (sign and magnitude).
 Understand methods used to add and subtract binary integers.
 Be able to represent and normalise floating point numbers.
 Be able to carry out floating point arithmetic.
Lesson Notes
Signed Binary Numbers
Signed binary numbers use a sign bit to represent positive and negative numbers. They are more versatile than unsigned binary numbers, but unsigned binary numbers are often used when performance is critical.
Unsigned Binary Numbers
Unsigned binary numbers do not have a sign bit, so they can only represent zero and positive numbers.
Sign and Magnitude System
8-bit sign and magnitude format.
In this system the most significant bit is used only as a sign: 0 for + and 1 for -. The remaining 7 bits hold the magnitude, so with 8 bits the largest number you can represent is 127 and the smallest is -127.
   -/+    64    32    16     8     4     2     1
     0     0     1     0     0     0     1     1    35_{10}
     1     0     1     0     0     0     1     1    -35_{10}
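As a rough illustration of the table above, sign and magnitude encoding can be sketched in Python. The function names `to_sign_magnitude` and `from_sign_magnitude` are made up for this example and do not come from any library.

```python
def to_sign_magnitude(value, bits=8):
    """Return value as a sign-and-magnitude bit string (hypothetical helper)."""
    limit = 2 ** (bits - 1) - 1              # 127 when bits is 8
    if abs(value) > limit:
        raise ValueError(f"{value} does not fit in {bits} bits")
    sign = "1" if value < 0 else "0"         # sign bit: 0 is +, 1 is -
    magnitude = format(abs(value), f"0{bits - 1}b")
    return sign + magnitude


def from_sign_magnitude(pattern):
    """Turn a sign-and-magnitude bit string back into an integer."""
    magnitude = int(pattern[1:], 2)
    return -magnitude if pattern[0] == "1" else magnitude


print(to_sign_magnitude(35))    # 00100011
print(to_sign_magnitude(-35))   # 10100011
```

One quirk of this system is that zero has two patterns, 00000000 and 10000000.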
Two's Complement Notation
Two's complement is another way of representing signed binary values. The most significant bit carries a negative (-) weight. The total number of values you can represent (when using 8 bits) still remains 2^{8} = 256 (-128 to 127). The highest positive value is 127.
  -128    64    32    16     8     4     2     1
     0     0     1     0     0     0     1     1    35_{10}
     1     1     0     1     1     1     0     1    -35_{10}
Two's Complement Negative Number Conversion Examples
Let's say we want to represent -5 in 8-bit two's complement:
 Positive version: 5 in binary is 00000101.
 One's complement: Flipping the bits gives 11111010.
 Two's complement: Add 1 to get 11111011.
So, 11111011 is how -5 is represented in 8-bit two's complement.
Here is an example of converting the number 25 to -25.
  -128    64    32    16     8     4     2     1
     0     0     0     1     1     0     0     1    25_{10}
     1     1     1     0     0     1     1     0    Flip
     0     0     0     0     0     0     0     1    Add 1
     1     1     1     0     0     1     1     1    -25_{10}
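The flip and add 1 steps above can be sketched in Python as follows. This is only an illustration: `twos_complement` and `from_twos_complement` are hypothetical helper names, and the bit patterns are handled as text strings to keep each step visible.

```python
def twos_complement(value, bits=8):
    """Return the two's complement bit string for value (hypothetical helper)."""
    if not -(2 ** (bits - 1)) <= value <= 2 ** (bits - 1) - 1:
        raise ValueError(f"{value} does not fit in {bits} bits")
    if value >= 0:
        return format(value, f"0{bits}b")
    positive = format(-value, f"0{bits}b")                        # positive version
    flipped = "".join("1" if b == "0" else "0" for b in positive)  # flip the bits
    return format(int(flipped, 2) + 1, f"0{bits}b")                # add 1


def from_twos_complement(pattern):
    """Read a bit string where the most significant bit has a negative weight."""
    value = int(pattern, 2)
    if pattern[0] == "1":
        value -= 2 ** len(pattern)
    return value


print(twos_complement(-5))    # 11111011
print(twos_complement(-25))   # 11100111
```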
Highest and Lowest Number
For unsigned binary, the minimum and maximum values for a given number of bits, n, are 0 and 2^{n} - 1 respectively.
An 8-bit binary number ranges between 0_{10} and 255_{10}.
   128    64    32    16     8     4     2     1
     0     0     0     0     0     0     0     0    0_{10}
     1     1     1     1     1     1     1     1    255_{10}
Most significant bit for 8 bits = 128
Zero counts as one of the stored values.
8 bits can be used to store 256 values (0 to 255), so 255 is the highest positive value.
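The ranges for the three systems covered so far can be worked out from the number of bits. The sketch below assumes a made-up function name, `value_range`, and made-up labels for the three systems.

```python
def value_range(bits, system):
    """Return (lowest, highest) for a given number of bits (hypothetical helper)."""
    if system == "unsigned":
        return 0, 2 ** bits - 1
    if system == "sign_magnitude":
        limit = 2 ** (bits - 1) - 1
        return -limit, limit
    if system == "twos_complement":
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    raise ValueError(f"unknown system: {system}")


for name in ("unsigned", "sign_magnitude", "twos_complement"):
    print(name, value_range(8, name))
```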
Binary Addition - THE RULES!!
Work right to left and apply these simple rules:
 0 + 0 = 0
 0 + 1 = 1
 1 + 0 = 1
 1 + 1 = 0 Carry 1
 1 + 1 + 1 = 1 Carry 1
Here are examples of two 8-bit numbers being added together. The top row shows the carries.
 Carry                       1     1     1
           0     0     0     0     1     1     1     0    14
     +     1     0     1     0     0     0     1     0    162
           1     0     1     1     0     0     0     0    176

 Carry     1                       1     1     1
           0     1     0     0     0     1     1     1    71
     +     0     1     1     0     0     0     0     1    97
           1     0     1     0     1     0     0     0    168
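The addition rules can be sketched in Python by working through the columns from right to left. The function name `add_binary` is an assumption made for this example; it works on bit strings so that the carry is handled explicitly rather than by Python's own integer addition.

```python
def add_binary(a, b):
    """Add two bit strings column by column, right to left (illustrative only)."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)     # pad the shorter number with 0s
    carry = 0
    result = []
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        total = int(bit_a) + int(bit_b) + carry
        result.append(str(total % 2))         # digit written in this column
        carry = total // 2                    # 1 if the column overflowed
    if carry:
        result.append("1")                    # final carry needs an extra bit
    return "".join(reversed(result))


print(add_binary("00001110", "10100010"))     # 14 + 162
```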
Overflow Error
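An overflow error occurs when the result needs an extra bit that does not fit in the number of bits available.
Here is an example of an overflow error. The answer needs 9 bits, so it cannot be stored in 8 bits.
 Carry     1                 1     1     1
                 1     1     0     0     1     1     0     0    204
     +           1     0     0     1     1     1     0     1    157
           1     0     1     1     0     1     0     0     1    361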
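A minimal sketch of how the overflow above could be detected, assuming unsigned values and a hypothetical helper called `add_unsigned`. It returns what would actually be stored in the available bits, together with a flag saying whether an extra bit was needed.

```python
def add_unsigned(a, b, bits=8):
    """Add two unsigned integers and report overflow (illustrative only)."""
    total = a + b
    overflow = total >= 2 ** bits        # the result needs more than `bits` bits
    stored = total % (2 ** bits)         # what is left once the extra bit is lost
    return stored, overflow


print(add_unsigned(204, 157))
print(add_unsigned(14, 162))
```

With 8 bits, 204 + 157 leaves only 105 (01101001) once the ninth bit is dropped.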
Adding 3 Binary Numbers
Method 1.
Add the first two, then add the third to the result.
We will carry out the addition 1011 + 0111 + 101.
We can see that 1011 (11) + 0111 (7) + 101 (5) = 23.
 Carry     1     1     1     1
                 1     0     1     1    11
     +           0     1     1     1    7
           1     0     0     1     0    18
     +                 1     0     1    5
           1     0     1     1     1    23
Method 2.
Logical Binary Shifts
Left Shift = Multiply. Each shift of one place to the left multiplies the number by 2.
0 Shift     0     0     0     0     1     0     0     0    Original
1 Shift     0     0     0     1     0     0     0     0    *2
2 Shift     0     0     1     0     0     0     0     0    *4
3 Shift     0     1     0     0     0     0     0     0    *8
Right Shift = Divide. Each shift of one place to the right divides the number by 2.
0 Shift     0     0     0     1     0     0     0     0    Original
1 Shift     0     0     0     0     1     0     0     0    /2
2 Shift     0     0     0     0     0     1     0     0    /4
3 Shift     0     0     0     0     0     0     1     0    /8
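Python has shift operators (`<<` and `>>`) that behave like the tables above. The sketch below wraps them in two hypothetical helpers; the mask in `shift_left` is an assumption that models a fixed 8-bit register, because Python integers otherwise grow as needed.

```python
def shift_left(value, places, bits=8):
    """Logical left shift that keeps only the lowest `bits` bits."""
    return (value << places) & (2 ** bits - 1)


def shift_right(value, places):
    """Logical right shift of a non-negative integer."""
    return value >> places


print(format(shift_left(0b00001000, 3), "08b"))    # 8 * 8
print(format(shift_right(0b00010000, 3), "08b"))   # 16 / 8
```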
Binary Multiplication
How do I multiply by numbers that aren't 2, 4, 8, 16, 32, 64, 128, 256...?
         128    64    32    16     8     4     2     1
           0     0     0     1     1     0     0     1    25
     x     0     0     0     0     1     0     1     0    10
           1     1     0     0     1     0     0     0    200
     +     0     0     1     1     0     0     1     0    50
           1     1     1     1     1     0     1     0    250
Take the first number as the multiplier. Multiply it by each 1 digit of the second number (the multiplicand) to get intermediate products, each shifted so that its last digit lines up with the corresponding multiplicand digit. Then add the intermediate products.
Example: 10 = 8 + 2, so 25 x 10 = (25 x 8) + (25 x 2).
First x8: left shift 25 by 3 (11001000)
Then x2: left shift 25 by 1 (00110010)
Finally, add them together (11001000 + 00110010 = 11111010)
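The shift and add method can be sketched in Python as below. The names `shift_and_add`, `value` and `factor` are chosen for this example only. For every 1 bit in `factor`, a shifted copy of `value` is added to the running total.

```python
def shift_and_add(value, factor):
    """Multiply two non-negative integers using only shifts and additions."""
    result = 0
    places = 0
    while factor:
        if factor & 1:                    # this column of factor holds a 1
            result += value << places     # add value shifted into that column
        factor >>= 1                      # move on to the next column
        places += 1
    return result


print(format(shift_and_add(25, 10), "08b"))   # 25 x 10
```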
Subtracting Numbers
You can carry out subtraction in binary by using two's complement notation. Adding a positive number and a negative signed binary number together performs a subtraction. The example below demonstrates the operation 25 + (-10) = 15.
25_{10} in binary.
  -128    64    32    16     8     4     2     1
     0     0     0     1     1     0     0     1    25_{10}
10 being turned into -10.
  -128    64    32    16     8     4     2     1
     0     0     0     0     1     0     1     0    10_{10}
     1     1     1     1     0     1     0     1    Flip
     0     0     0     0     0     0     0     1    Add 1
     1     1     1     1     0     1     1     0    -10_{10}
25 + (-10) using the standard binary addition rules. The carry out of the most significant bit is discarded.
  -128    64    32    16     8     4     2     1
     0     0     0     1     1     0     0     1    25_{10}
     1     1     1     1     0     1     1     0    -10_{10}
     0     0     0     0     1     1     1     1    15_{10}
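A minimal Python sketch of the same subtraction, assuming 8-bit values and a hypothetical helper called `subtract`. The second number is negated with flip and add 1, the two patterns are added, and any carry out of the top bit is thrown away by the mask.

```python
def subtract(a, b, bits=8):
    """Compute a - b by adding the two's complement of b (illustrative only)."""
    mask = 2 ** bits - 1                  # 11111111 for 8 bits
    negated_b = ((b ^ mask) + 1) & mask   # flip every bit, then add 1
    total = (a + negated_b) & mask        # add and discard the carry out
    return format(total, f"0{bits}b")


print(subtract(25, 10))   # 00001111
```

If the answer is negative, the returned pattern is its two's complement form; for example, 10 - 25 gives 11110001.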
Fixed Decimal Point Numbers
Using bits to the right of the units column (after a notional binary point) introduces fractional values.
Fractional place values are negative powers of 2.
A fixed-point binary value uses a specified number of bits where the placement of the binary point is fixed.
For example, in an 8-bit fixed-point binary value, the binary point could be set between the fourth and fifth bits.
   2^{3}   2^{2}   2^{1}   2^{0}       .  2^{-1}  2^{-2}  2^{-3}  2^{-4}
       8       4       2       1       .     1/2     1/4     1/8    1/16
       0       0       0       1       •       1       0       0       1    1.5625_{10}
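Reading a fixed-point value is the same as reading an ordinary binary integer and then dividing by 2 for every bit after the point. The sketch below assumes an unsigned pattern and a hypothetical helper name, `fixed_point_value`.

```python
def fixed_point_value(pattern, fraction_bits=4):
    """Decode an unsigned fixed-point bit string (illustrative only)."""
    as_integer = int(pattern, 2)              # read the bits as a whole number
    return as_integer / 2 ** fraction_bits    # move the point fraction_bits places left


print(fixed_point_value("00011001"))   # 0001.1001
```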
Floating Point Binary
A real number in floating point binary has three parts:
 The Sign: whether the number is positive or negative
 Mantissa: the part of a floating-point number which represents the significant digits of that number (the value)
 Exponent: the power of 2 that the mantissa is multiplied by (how far the binary point needs to be shifted)
Mantissa                             | Exponent
    -1     .   1/2   1/4   1/8  1/16 |    -4     2     1
     1     •     1     0     0     0 |     0     0     1
 Sign bit = 1
 Mantissa represented as a Two's Complement Number
 Exponent represented as a Two's Complement Number
Standard Form?
Standard form, also known as scientific notation, is a way of writing very large or very small numbers in a way that makes them easier to read and write. It is based on the idea of using powers of 10 to represent the number. Computers, however, work with binary values. So, instead of multiplying by powers of ten, they use floating point representation.
5,000,000 can be written as 5 x 10^{6}
 Mantissa = 5
 Exponent = 6
 Base = 10
Floating Point - Positive Exponent
Floating point binary lets us represent a much wider range of values with the same number of bits, including very large numbers and very small fractions.
Mantissa                             | Exponent
    -1     .   1/2   1/4   1/8  1/16 |    -4     2     1
     0     •     1     0     1     1 |     0     1     0
To store the number in normalised (standard) form, the first and second digits of the mantissa have to be opposite.
The mantissa and exponent are stored together as one number. The mantissa holds the digits of the value. The exponent gives the position of the floating point.
Example:
 The floating point always starts at the same position.
 Exponent in the example is +2.
 This means the floating point moves 2 places to the right.
 The Base 2 weighting of each mantissa digit also changes.
Mantissa                             | Exponent
    -4     2     1     .   1/2   1/4 |    -4     2     1
     0     1     0     •     1     1 |     0     1     0
2 + 0.5 + 0.25 = 2.75
Floating Point - Negative Exponent
In the example below the exponent is -2.
Mantissa                             | Exponent
    -1     .   1/2   1/4   1/8  1/16 |    -4     2     1
     0     •     1     0     1     1 |     1     1     0
If the exponent is a negative number, the floating point moves to the left.
The mantissa is given extra bits to accommodate the new position of the floating point.
Example:
Mantissa                                         | Exponent
    -1     .   1/2   1/4   1/8  1/16  1/32  1/64 |    -4     2     1
     0     •     0     0     1     0     1     1 |     1     1     0
1/8 (0.125) + 1/32 (0.03125) + 1/64 (0.015625) = 0.171875
Floating Point - Negative Mantissa and Negative Exponent
In the example below the mantissa is -0.8125 and the exponent is -2.
When both the exponent and the mantissa are negative numbers, any new binary digit added must be a 1.
Mantissa                             | Exponent
    -1     .   1/2   1/4   1/8  1/16 |    -4     2     1
     1     •     0     0     1     1 |     1     1     0
The mantissa in the example below shows the new place digits as 1s.
The same principle applies to positive mantissas (whatever the +/- status of the exponent). If the mantissa starts with 0, the new place digits will be 0s.
Example:
Mantissa                                         | Exponent
    -1     .   1/2   1/4   1/8  1/16  1/32  1/64 |    -4     2     1
     1     •     1     1     0     0     1     1 |     1     1     0
-1 + 1/2 + 1/4 + 1/32 + 1/64 = -0.203125
Floating Point - Negative Mantissa and Positive Exponent
In this example the exponent is +4 and the mantissa is -0.875 (-1 + 1/8).
This means the floating point will need to move 4 places to the right.
Mantissa                       | Exponent
    -1     .   1/2   1/4   1/8 |    -8     4     2     1
     1     •     0     0     1 |     0     1     0     0
The mantissa in the example below shows that 2 extra digits have been added to the right.
Right of the floating point is the fractional value.
Left of the point is the standard Base 2 weighting, with a negative weight on the most significant bit.
Example:
Mantissa                                   | Exponent
   -16     8     4     2     1     .   1/2 |    -8     4     2     1
     1     0     0     1     0     •     0 |     0     1     0     0
-16 + 2 = -14
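The four worked examples in the floating point sections above can be checked with a short Python sketch. It assumes, as these notes do, that both the mantissa and the exponent are two's complement and that the binary point sits straight after the first mantissa bit. The names `signed` and `decode_float` are hypothetical.

```python
def signed(bits):
    """Read a bit string as a two's complement integer."""
    value = int(bits, 2)
    return value - 2 ** len(bits) if bits[0] == "1" else value


def decode_float(mantissa, exponent):
    """Return the value of a mantissa/exponent pair given as bit strings."""
    fraction = signed(mantissa) / 2 ** (len(mantissa) - 1)   # point after first bit
    return fraction * 2 ** signed(exponent)                  # shift by the exponent


print(decode_float("01011", "010"))    # positive exponent
print(decode_float("01011", "110"))    # negative exponent
print(decode_float("10011", "110"))    # negative mantissa, negative exponent
print(decode_float("1001", "0100"))    # negative mantissa, positive exponent
```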
Fixed vs Floating
Fixed and floating point each have their own advantages and disadvantages in terms of range, precision and the speed of calculation.
Floating point allows a far greater range of numbers using the same number of bits. Very large numbers and very small fractional numbers can be represented. The larger the mantissa, the greater the precision, and the larger the exponent, the greater the range.
Fixed-point numbers have a limited range, which is determined by the number of bits used to represent them. Fixed-point numbers have a fixed precision, which means that the same number of digits are always stored, regardless of the value of the number.
Fixed point binary is a simpler system and is faster to process compared to floating point.
Advantages of Fixed Point:
 Faster calculations
 Less hardware required
 Easier to debug
Advantages of Floating Point:
 Wide range
 Variable precision
 Can represent a wider variety of values
The best choice of representation will depend on the specific application. If speed and efficiency are critical, then fixed-point may be the better choice. If range or precision is critical, then floating-point may be the better choice.
Normalisation
There are two main reasons why we need to normalise floating point binary numbers:
 To ensure maximum accuracy. When a floating point number is normalised, no mantissa bits are wasted on leading 0s (or leading 1s for a negative number). The available bits then hold as many significant digits as possible, which gives the greatest possible accuracy.
 To ensure uniqueness. When a floating point number is normalised, each unique number has only one possible bit pattern to represent it. This is important for ensuring that floating point operations are performed correctly.
Normalisation is the process of moving the binary point of a floating point number to provide the maximum level of precision for a given number of bits.
 To do this for a positive binary number involves removing any leading zeros (0s).
 To do the same for a negative binary number involves removing any leading ones (1s).
 This means that a normalised floating point number must always start as either 0.1 or 1.0.
Normalise a Positive Floating Point Number
Example: 14.125 into a normalised two's complement floating point number with a 10-bit mantissa and a 6-bit exponent.
    16     8     4     2     1     .   1/2   1/4   1/8  1/16  1/32
     0     1     1     1     0     •     0     0     1     0     0
8 + 4 + 2 + 1/8 (0.125) = 14.125
The most significant bit of a positive two's complement number must be 0. In order to maximise precision the next most significant bit must be 1.
Place the binary point between the first 0 and 1. This moves it 4 places to the left.
The value of the exponent is 4 because the point would have to be moved 4 places to the right to return to the original number.
The normalised two's complement floating point representation of 14.125 is below.
Mantissa                                                           | Exponent
    -1     .   1/2   1/4   1/8  1/16  1/32  1/64 1/128 1/256 1/512 |   -32    16     8     4     2     1
     0     •     1     1     1     0     0     0     1     0     0 |     0     0     0     1     0     0
Normalise a Negative Floating Point Number
Example: -45.375 into a normalised two's complement floating point number with a 10-bit mantissa and a 6-bit exponent.
   -64    32    16     8     4     2     1     .   1/2   1/4   1/8
     1     0     1     0     0     1     0     •     1     0     1
-64 + 16 + 2 + 1/2 (0.5) + 1/8 (0.125) = -45.375
In a negative two's complement format the most significant bit must be 1.
In order to maximise precision the next most significant bit must be 0.
We must place the binary point between the first 1 and 0. In this case we need to move the binary point 6 places to the left, so the exponent is 6.
The normalised two's complement floating point representation of -45.375 is below.
Mantissa                                                           | Exponent
    -1     .   1/2   1/4   1/8  1/16  1/32  1/64 1/128 1/256 1/512 |   -32    16     8     4     2     1
     1     •     0     1     0     0     1     0     1     0     1 |     0     0     0     1     1     0
Normalise a Negative Floating Point Number - Negative Exponent
Example: -0.46875 into a normalised two's complement floating point number with a 10-bit mantissa and a 6-bit exponent.
   -16     8     4     2     1     .   1/2   1/4   1/8  1/16  1/32
     1     1     1     1     1     •     1     0     0     0     1
-16 + 8 + 4 + 2 + 1 + 1/2 (0.5) + 1/32 (0.03125) = -0.46875
The binary point must sit between the first 1 and 0. In a negative number leading 1s can be lost, so in this case we need to move the binary point 1 place to the right and lose 5 leading 1s.
The binary point was moved 1 place to the right, so the value of the exponent is -1 because the point would have to be moved 1 place to the left to return to the original number.
Final number is below:
Mantissa                                                           | Exponent
    -1     .   1/2   1/4   1/8  1/16  1/32  1/64 1/128 1/256 1/512 |   -32    16     8     4     2     1
     1     •     0     0     0     1     0     0     0     0     0 |     1     1     1     1     1     1
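The normalisation procedure can be sketched in Python. This is a rough illustration under the same assumptions as the examples (two's complement mantissa and exponent, binary point after the first mantissa bit); `normalise` is a hypothetical helper and it works on the decimal value rather than on the bit pattern itself.

```python
def normalise(value, mantissa_bits=10, exponent_bits=6):
    """Return (mantissa, exponent) bit strings for a non-zero value."""
    if value == 0:
        raise ValueError("zero cannot be normalised")
    mantissa, exponent = value, 0
    # Too big for the mantissa: move the binary point left one place at a time.
    while mantissa >= 1 or mantissa < -1:
        mantissa /= 2
        exponent += 1
    # Leading 0s (positive) or leading 1s (negative): move the point right.
    while -0.5 <= mantissa < 0.5:
        mantissa *= 2
        exponent -= 1
    if not -(2 ** (exponent_bits - 1)) <= exponent < 2 ** (exponent_bits - 1):
        raise OverflowError("exponent does not fit in the bits allowed")
    # Scale the fraction to a whole number; int() drops any bits that do not fit.
    scaled = int(mantissa * 2 ** (mantissa_bits - 1))
    mantissa_pattern = format(scaled % 2 ** mantissa_bits, f"0{mantissa_bits}b")
    exponent_pattern = format(exponent % 2 ** exponent_bits, f"0{exponent_bits}b")
    return mantissa_pattern, exponent_pattern


print(normalise(-45.375))    # ('1010010101', '000110')
print(normalise(-0.46875))   # ('1000100000', '111111')
```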
Floating Point Arithmetic
 Work out where the binary point goes in each number.
 Line both numbers up on the normal binary number line. The binary points need to be in the same place.
 Now add them as normal.
1. These are the floating point numbers we are adding.
Mantissa                             | Exponent
    -1     .   1/2   1/4   1/8  1/16 |    -4     2     1
     0     •     1     0     1     0 |     0     1     1
     0     •     1     0     1     1 |     0     0     1
2. Use the exponent to put the binary point in the correct place in each number.
3. Convert to fixed point. Line up the binary points. Add 0s in the extra places.
     8     4     2     1     .   1/2   1/4   1/8  1/16
     0     1     0     1     •     0     0     0     0
     0     0     0     1     •     0     1     1     0
4. Normal addition on both numbers.
 Carry                 1
           0     1     0     1     •     0     0     0     0    5
     +     0     0     0     1     •     0     1     1     0    1.375
           0     1     1     0     •     0     1     1     0    6.375
5. Normalise the result. Move the binary point 3 places to the left. It is a positive number, so the msb must be 0 and the 2nd msb must be 1. The exponent is +3 because the point would have to move 3 places to the right to return to the original value.
Mantissa                                               | Exponent
    -1     .   1/2   1/4   1/8  1/16  1/32  1/64 1/128 |    -4     2     1
     0     •     1     1     0     0     1     1     0 |     0     1     1
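The five steps above can be sketched in Python. This is a simplified illustration rather than how real hardware does it: each number is decoded to its value, the values are added, and the answer is normalised again. The names `signed`, `decode`, `encode` and `add_floats` are all hypothetical, and the 8-bit mantissa and 3-bit exponent match the final table above.

```python
def signed(bits):
    """Read a bit string as a two's complement integer."""
    value = int(bits, 2)
    return value - 2 ** len(bits) if bits[0] == "1" else value


def decode(mantissa, exponent):
    """Steps 2 and 3: use the exponent to find the fixed-point value."""
    return signed(mantissa) / 2 ** (len(mantissa) - 1) * 2 ** signed(exponent)


def encode(value, mantissa_bits, exponent_bits):
    """Step 5: normalise a non-zero value back into mantissa and exponent."""
    if value == 0:
        raise ValueError("zero cannot be normalised")
    mantissa, exponent = value, 0
    while mantissa >= 1 or mantissa < -1:      # move the point left
        mantissa /= 2
        exponent += 1
    while -0.5 <= mantissa < 0.5:              # move the point right
        mantissa *= 2
        exponent -= 1
    scaled = int(mantissa * 2 ** (mantissa_bits - 1))
    return (format(scaled % 2 ** mantissa_bits, f"0{mantissa_bits}b"),
            format(exponent % 2 ** exponent_bits, f"0{exponent_bits}b"))


def add_floats(a, b, mantissa_bits=8, exponent_bits=3):
    """Step 4: add the two values, then normalise the result."""
    return encode(decode(*a) + decode(*b), mantissa_bits, exponent_bits)


print(add_floats(("01010", "011"), ("01011", "001")))
```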
Floating Point Subtraction
 To subtract a floating point number from another, first convert them both to fixed point.
 Find the two’s complement of the number to be subtracted.
 Add the two numbers.
 Convert result to normalised floating point.
1. These are the floating point numbers we are subtracting. The second number is subtracted from the first.
Mantissa                             | Exponent
    -1     .   1/2   1/4   1/8  1/16 |    -4     2     1
     0     •     1     1     0     0 |     0     1     0
     0     •     1     0     1     0 |     0     1     1
2. Use the exponent to put the binary point in the correct place in each number.
3. Convert to fixed point. Line up the binary points. Add 0s in the extra places.
     8     4     2     1     .   1/2   1/4   1/8  1/16
     0     0     1     1     •     0     0     0     0
     0     1     0     1     •     0     0     0     0
4. Find the two's complement of the number being subtracted. Flip and add one. Because the digits to the right of the binary point are all 0 in this example, they stay the same.
           0     1     0     1     •     0     0     0     0    5
           1     0     1     0     •     0     0     0     0    Flip
     +     0     0     0     1     •                            Add 1
           1     0     1     1     •     0     0     0     0    -5
5. Add the positive and negative number.
 Carry           1     1
           0     0     1     1     •     0     0     0     0    3
     +     1     0     1     1     •     0     0     0     0    -5
           1     1     1     0     •     0     0     0     0    -2
6. Normalise the result. It is a negative number, so the leading 1s are removed until the number starts 1.0. The binary point moves 1 place to the left, so the exponent is +1.
Mantissa                                               | Exponent
    -1     .   1/2   1/4   1/8  1/16  1/32  1/64 1/128 |    -4     2     1
     1     •     0     0     0     0     0     0     0 |     0     0     1
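The fixed-point part of the subtraction (steps 3 to 5) can be sketched in Python. The sketch assumes an 8-bit pattern with 4 bits after the binary point, as in the tables, and the helper names `to_fixed`, `show`, `negate` and `subtract` are made up for this example. Note that in general every bit is flipped, including those after the point, and the 1 is added to the rightmost bit.

```python
def to_fixed(value, bits=8, fraction_bits=4):
    """Two's complement fixed-point pattern of value, held as an integer."""
    return int(value * 2 ** fraction_bits) % 2 ** bits


def show(pattern, bits=8, fraction_bits=4):
    """Format a pattern as text with the binary point written in."""
    text = format(pattern, f"0{bits}b")
    return text[:bits - fraction_bits] + "." + text[bits - fraction_bits:]


def negate(pattern, bits=8):
    """Flip every bit, then add 1 to the least significant bit."""
    mask = 2 ** bits - 1
    return ((pattern ^ mask) + 1) & mask


def subtract(a, b, bits=8):
    """a - b: add the two's complement of b and drop any carry out."""
    return (a + negate(b, bits)) % 2 ** bits


print(show(negate(to_fixed(5))))                   # 1011.0000
print(show(subtract(to_fixed(3), to_fixed(5))))    # 1110.0000
```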
Underflow and Overflow
Overflow occurs when the result of a calculation is too large to be held in the number of bits allocated.
For example, adding two integers in an 8-bit byte (treating both as unsigned, so there is no sign bit).
 Carry     1                                   1     1
                 1     0     0     0     0     0     0     1    129
     +           1     0     0     0     0     0     1     1    131
           1     0     0     0     0     0     1     0     0    260
Underflow occurs when a number is too small to be represented in the number of bits allocated.
It may occur if a very small number is divided by a number greater than 1.
Example: an 8-bit binary number shifted one place to the right. The rightmost 1 is shifted out of the 8 bits and lost.
           1     0     0     0     0     0     0     1
    ÷2     0     1     0     0     0     0     0     0     1 (lost)
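The right shift in the underflow example can be sketched in Python. The function name `shift_right_once` is an assumption for this illustration; it returns the new 8-bit pattern together with the bit that fell off the end.

```python
def shift_right_once(pattern):
    """Shift a bit string one place right and report the bit that is lost."""
    lost_bit = pattern[-1]                # the bit pushed past the last column
    shifted = "0" + pattern[:-1]          # a 0 is moved in on the left
    return shifted, lost_bit


shifted, lost = shift_right_once("10000001")
print(shifted, lost)
print(int("10000001", 2) / 2, int(shifted, 2))   # exact answer vs stored answer
```

Here 129 divided by 2 should be 64.5, but only 64 can be stored, so the 0.5 is lost.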