
mrahmedcomputing

KS3, GCSE, A-Level Computing Resources

Lesson 2. Advanced Binary Arithmetic


Lesson Objective

  • Know the difference between unsigned and signed binary (sign and magnitude).
  • Understand methods used to add and subtract binary integers.
  • Be able to represent and normalise floating point numbers.
  • Be able to carry out floating point arithmetic.

Lesson Notes

Signed Binary Numbers

Signed binary numbers use a sign bit to represent positive and negative numbers. They are more versatile than unsigned binary numbers, but unsigned binary numbers are often used when performance is critical.

Unsigned Binary Numbers

Unsigned binary numbers do not have a sign bit, so they can only represent zero and positive numbers.

Sign and Magnitude System

8-bit sign and magnitude format.

In this system, using 8 bits means the largest number you can represent is 127 and the smallest is -127. The most significant bit is used as the sign: 0 for + and 1 for -. The remaining 7 bits hold the magnitude.

-/+ 64 32 16 8 4 2 1
0 0 1 0 0 0 1 1 = 35₁₀
1 0 1 0 0 0 1 1 = -35₁₀
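The sign and magnitude layout above can be sketched in Python. This is only an illustration; the function names `to_sign_magnitude` and `from_sign_magnitude` are made up for this example.

```python
def to_sign_magnitude(value, bits=8):
    """Return the sign and magnitude bit string for value."""
    limit = 2 ** (bits - 1) - 1              # 127 when bits is 8
    if abs(value) > limit:
        raise ValueError(f"{value} does not fit in {bits} bits")
    sign = "1" if value < 0 else "0"         # MSB: 0 means +, 1 means -
    magnitude = format(abs(value), f"0{bits - 1}b")
    return sign + magnitude


def from_sign_magnitude(pattern):
    """Decode a sign and magnitude bit string back to an integer."""
    magnitude = int(pattern[1:], 2)
    return -magnitude if pattern[0] == "1" else magnitude


print(to_sign_magnitude(35))              # 00100011
print(to_sign_magnitude(-35))             # 10100011
print(from_sign_magnitude("10100011"))    # -35
```

Only the first bit differs between 35 and -35, which is the defining feature of this system.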

Two's Complement Notation

Two's complement is another way to represent signed binary values. The most significant bit carries a negative weight (-128 when using 8 bits). The total number of values you can represent with 8 bits is still 256 (2^8), covering -128 to 127. The highest positive value is 127.

-128 64 32 16 8 4 2 1
0 0 1 0 0 0 1 1 = 35₁₀
1 1 0 1 1 1 0 1 = -35₁₀

Two's Complement Negative Number Conversion Examples

Let's say we want to represent -5 in 8-bit two's complement:

  1. Positive version: 5 in binary is 00000101.
  2. One's complement: Flipping the bits gives 11111010.
  3. Two's complement: Add 1 to get 11111011.

So, 11111011 is how -5 is represented in 8-bit two's complement.

Here is an example of converting the number 25 to -25.

-128 64 32 16 8 4 2 1
0 0 0 1 1 0 0 1 = 25₁₀
1 1 1 0 0 1 1 0 Flip
0 0 0 0 0 0 0 1 Add 1
1 1 1 0 0 1 1 1 = -25₁₀
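The flip and add 1 steps above can be sketched in Python. This is a minimal illustration rather than part of the original lesson; the names `negate` and `twos_complement_value` are assumptions made for this example.

```python
def negate(pattern):
    """Two's complement negation: flip every bit, add 1, keep the same width."""
    bits = len(pattern)
    flipped = "".join("1" if b == "0" else "0" for b in pattern)
    total = (int(flipped, 2) + 1) % (2 ** bits)   # drop any carry out
    return format(total, f"0{bits}b")


def twos_complement_value(pattern):
    """Read a bit string where the most significant bit has negative weight."""
    value = int(pattern, 2)
    if pattern[0] == "1":
        value -= 2 ** len(pattern)
    return value


print(negate("00000101"))                  # 11111011
print(negate("00011001"))                  # 11100111
print(twos_complement_value("11100111"))   # -25
```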

Highest and Lowest Number

For unsigned binary, the minimum and maximum values for a given number of bits, n, are 0 and 2^n - 1 respectively.

An 8-bit unsigned binary number ranges from 0₁₀ to 255₁₀.

128 64 32 16 8 4 2 1
0 0 0 0 0 0 0 0 = 0₁₀
1 1 1 1 1 1 1 1 = 255₁₀

Most significant bit for 8 bits = 128

Zero counts as one of the values.

8 bits can be used to store 256 values, so 255 is the highest value.
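The ranges quoted in this lesson for unsigned, sign and magnitude, and two's complement numbers can be checked with a short Python sketch. The function names are made up for this illustration.

```python
def unsigned_range(bits):
    """Lowest and highest unsigned values: 0 to 2^n - 1."""
    return 0, 2 ** bits - 1


def sign_magnitude_range(bits):
    """One bit is the sign, so the magnitude uses the other n - 1 bits."""
    limit = 2 ** (bits - 1) - 1
    return -limit, limit


def twos_complement_range(bits):
    """The most significant bit has a weight of -2^(n - 1)."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


print(unsigned_range(8))          # (0, 255)
print(sign_magnitude_range(8))    # (-127, 127)
print(twos_complement_range(8))   # (-128, 127)
```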


Binary Addition - THE RULES!!

Work right to left and apply these simple rules:

  1. 0 + 0 = 0
  2. 0 + 1 = 1
  3. 1 + 0 = 1
  4. 1 + 1 = 0 Carry 1
  5. 1 + 1 + 1 = 1 Carry 1

Here are examples of two 8 bit numbers being added together:

1 1 1
0 0 0 0 1 1 1 0 14
+ 1 0 1 0 0 0 1 0 162
1 0 1 1 0 0 0 0 176

...

1 1 1 1
0 1 0 0 0 1 1 1 71
+ 0 1 1 0 0 0 0 1 97
1 0 1 0 1 0 0 0 168
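The addition rules can be written as a column-by-column loop. The sketch below is one possible way to do it in Python; `add_binary` is a hypothetical helper name, not something defined by the lesson.

```python
def add_binary(a, b):
    """Add two bit strings column by column, working right to left."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    carry = 0
    digits = []
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        total = int(bit_a) + int(bit_b) + carry
        digits.append(str(total % 2))   # digit written in this column
        carry = total // 2              # carry 1 when the column adds up to 2 or 3
    if carry:
        digits.append("1")              # an extra bit appears on the left
    return "".join(reversed(digits))


print(add_binary("00001110", "10100010"))   # 10110000
print(add_binary("01000111", "01100001"))   # 10101000
```

The two calls use the same bit patterns as the worked examples above.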

Overflow Error

An overflow error occurs when the result needs an extra bit, beyond the number of bits available, to represent it.

Here is an example of an overflow error:

1 1 1 1 1
1 1 0 0 1 1 0 0 204
+ 1 0 0 1 1 1 0 1 157
1 0 1 1 0 1 0 0 1 361
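A program can detect this kind of overflow by comparing the result with the largest value that fits. The following Python sketch assumes 8-bit unsigned values; `add_unsigned` is a name invented for the example.

```python
def add_unsigned(a, b, bits=8):
    """Add two unsigned integers and report whether the result overflows."""
    total = a + b
    overflow = total > 2 ** bits - 1    # needs more than the available bits
    stored = total % (2 ** bits)        # what remains if the extra bit is dropped
    return stored, overflow


print(add_unsigned(204, 157))   # (105, True)
print(add_unsigned(14, 162))    # (176, False)
```

204 + 157 = 361 needs 9 bits. If only 8 bits are kept, the stored value is 361 - 256 = 105.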

Adding 3 Binary Numbers

Method 1.

Add the first two, then add the third to the result.

We will carry out the addition 1011 + 0111 + 101

We can see that 1011 (11) + 0111 (7) + 101 (5) = 23

1 1 1 1
1 0 1 1 11
+ 0 1 1 1 7
1 0 0 1 0 18
+ 1 0 1 5
1 0 1 1 1 23
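Method 1 keeps a running total, which can be sketched in Python as follows. The helper name `add_many` is an assumption for this illustration.

```python
def add_many(patterns):
    """Add bit strings one at a time and return each running total in binary."""
    total = 0
    steps = []
    for pattern in patterns:
        total += int(pattern, 2)
        steps.append(format(total, "b"))
    return steps


print(add_many(["1011", "0111", "101"]))   # ['1011', '10010', '10111']
```

The middle entry, 10010 (18), is the intermediate result of 1011 + 0111 before 101 is added.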

Method 2.


Logical Binary Shifts

Left Shift = Multiply. Each shift to the left multiplies the number by 2, so shifting n places multiplies it by 2^n.

0 Shift 0 0 0 0 1 0 0 0 Original
1 Shift 0 0 0 1 0 0 0 0 *2
2 Shift 0 0 1 0 0 0 0 0 *4
3 Shift 0 1 0 0 0 0 0 0 *8

Right Shift = Divide. Each shift to the right divides the number by 2, so shifting n places divides it by 2^n.

0 Shift 0 0 0 1 0 0 0 0 Original
1 Shift 0 0 0 0 1 0 0 0 /2
2 Shift 0 0 0 0 0 1 0 0 /4
3 Shift 0 0 0 0 0 0 1 0 /8
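Python's `<<` and `>>` operators perform these shifts on integers. The sketch below assumes an 8-bit register, so the left shift masks off any bits pushed beyond the eighth place; the function names are made up for this example.

```python
def shift_left(value, places, bits=8):
    """Logical left shift inside a fixed-width register."""
    return (value << places) & (2 ** bits - 1)


def shift_right(value, places):
    """Logical right shift of an unsigned value."""
    return value >> places


for places in range(4):
    print(places, format(shift_left(0b00001000, places), "08b"))
# 0 00001000
# 1 00010000
# 2 00100000
# 3 01000000

print(format(shift_right(0b00010000, 3), "08b"))   # 00000010
```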

Binary Multiplication

How do I multiply by numbers that aren't 2, 4, 8, 16, 32, 64, 128, 256...?

128 64 32 16 8 4 2 1
0 0 0 1 1 0 0 1 25
x 0 0 0 0 1 0 1 0 10
1 1 0 0 1 0 0 0 200
+ 0 0 1 1 0 0 1 0 50
1 1 1 1 1 0 1 0 250

Take the first number as the multiplicand and the second as the multiplier. For each 1 in the multiplier, write down a copy of the multiplicand shifted left so that its last digit lines up with that 1. These shifted copies are the intermediate products. Then add the intermediate products together.

Example: 10 = 8 + 2, so 25 x 10 = (25 x 8) + (25 x 2).

First x8: left shift 25 by 3 (11001000).

Then x2: left shift 25 by 1 (00110010).

Finally, add them together (11001000 + 00110010 = 11111010).
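The shift and add method can be sketched in Python. This is an illustrative version only; `multiply_shift_add` is a hypothetical name.

```python
def multiply_shift_add(a, b):
    """Multiply a by b using only shifts and additions."""
    product = 0
    partials = []
    position = 0
    while b:
        if b & 1:                        # this digit of b is a 1
            partial = a << position      # a shifted to line up with it
            partials.append(partial)
            product += partial
        b >>= 1
        position += 1
    return product, partials


product, partials = multiply_shift_add(25, 10)
print(product)                                  # 250
print([format(p, "08b") for p in partials])     # ['00110010', '11001000']
```

The two intermediate products are 50 (25 shifted left by 1) and 200 (25 shifted left by 3).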


Subtracting Numbers

You can carry out subtraction in binary by using two's complement notation. Adding a positive and a negative signed binary number together performs a subtraction. The example below demonstrates the operation 25 + (-10) = 15.

25₁₀ in binary.

-128 64 32 16 8 4 2 1
0 0 0 1 1 0 0 1 = 25₁₀

10 being turned into -10.

-128 64 32 16 8 4 2 1
0 0 0 0 1 0 1 0 = 10₁₀
1 1 1 1 0 1 0 1 Flip
0 0 0 0 0 0 0 1 Add 1
1 1 1 1 0 1 1 0 = -10₁₀

25 + (-10) using the standard binary addition rules. The carry out of the most significant bit is discarded.

-128 64 32 16 8 4 2 1
0 0 0 1 1 0 0 1 = 25₁₀
+ 1 1 1 1 0 1 1 0 = -10₁₀
0 0 0 0 1 1 1 1 = 15₁₀
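The same subtraction can be sketched in Python with bitwise operators. It assumes 8-bit values, and the function name `subtract` is made up for the illustration.

```python
def subtract(a, b, bits=8):
    """Compute a - b by adding the two's complement of b."""
    mask = 2 ** bits - 1
    neg_b = ((b ^ mask) + 1) & mask      # flip the bits of b, then add 1
    total = (a + neg_b) & mask           # carry out of the top bit is discarded
    return format(neg_b, f"0{bits}b"), format(total, f"0{bits}b")


neg_ten, result = subtract(25, 10)
print(neg_ten)   # 11110110
print(result)    # 00001111
```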

Fixed Decimal Point Numbers

Using bits to the right of the units column (after a notional point) introduces fractional values.

Fractional values are negative powers of 2.

A fixed-point binary value uses a specified number of bits where the placement of the binary point is fixed.

For example, in an 8 bit fixed-point binary value, the binary point could be set between the fourth and fifth bits.

2^3 2^2 2^1 2^0 . 2^-1 2^-2 2^-3 2^-4
-8 4 2 1 . 1/2 1/4 1/8 1/16
0 0 0 1 . 1 0 0 1 = 1.5625₁₀
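A fixed-point pattern can be decoded by reading it as a two's complement integer and dividing by 2 for each fractional place. The Python sketch below assumes the 8-bit layout above, with 4 bits after the point; `fixed_point_value` is a name invented for this example.

```python
def fixed_point_value(pattern, fraction_bits=4):
    """Decode a two's complement fixed-point bit string."""
    raw = int(pattern, 2)
    if pattern[0] == "1":
        raw -= 2 ** len(pattern)         # most significant bit has negative weight
    return raw / 2 ** fraction_bits      # move the point fraction_bits places left


print(fixed_point_value("00011001"))   # 1.5625
print(fixed_point_value("11100000"))   # -2.0
```

0001.1001 is 1 + 1/2 + 1/16 = 1.5625.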

Floating Point Binary

A Real number in binary has three parts:

  1. The Sign: positive or negative number
  2. Mantissa: the part of a floating-point number which represents the significant digits of that number (the value)
  3. Exponent: the power of 2 that the mantissa is multiplied by (how far the binary point needs to be shifted)

Mantissa | Exponent
-1 1/2 1/4 1/8 1/16 | -4 2 1
1 1 0 0 0 | 0 0 1

Standard Form?

Standard form, also known as scientific notation, is a way of writing very large or very small numbers in a way that makes them easier to read and write. It is based on the idea of using powers of 10 to represent the number. Computers, however, work with binary values. So, instead of multiplying by powers of ten, they use floating point representation.

5,000,000 can be written as 5 x 10^6

Floating Point - Positive Exponent

By using floating point binary we can represent a much wider range of values, including very large numbers and very small fractions, with the same number of bits.

Mantissa | Exponent
-1 1/2 1/4 1/8 1/16 | -4 2 1
0 1 0 1 1 | 0 1 0

To store the number in standard (normalised) form, the first and second digits of the mantissa have to be opposite.

The mantissa and exponent are stored together as one bit pattern. The mantissa holds the digits of the number, and the exponent gives the position of the binary point. Here the exponent is 2, so the point moves 2 places to the right.

Example:

Mantissa | Exponent
-4 2 1 . 1/2 1/4 | -4 2 1
0 1 0 . 1 1 | 0 1 0

2 + 0.5 + 0.25 = 2.75
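The value of a floating point pattern is the mantissa multiplied by 2 to the power of the exponent. The Python sketch below assumes both parts are stored in two's complement, with the binary point after the first mantissa bit; the helper names are made up for this illustration.

```python
def signed_value(pattern):
    """Two's complement value of a bit string."""
    value = int(pattern, 2)
    if pattern[0] == "1":
        value -= 2 ** len(pattern)
    return value


def decode_float(mantissa, exponent):
    """Return mantissa x 2^exponent for two's complement bit strings."""
    # The point sits after the first mantissa bit, so scale the raw integer down.
    m = signed_value(mantissa) / 2 ** (len(mantissa) - 1)
    e = signed_value(exponent)
    return m * 2 ** e


print(decode_float("01011", "010"))   # 2.75
```

The mantissa 0.1011 is 0.6875 and the exponent is 2, so the value is 0.6875 x 4 = 2.75.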

Floating Point - Negative Exponent

In the example below the exponent is -2

Mantissa | Exponent
-1 1/2 1/4 1/8 1/16 | -4 2 1
0 1 0 1 1 | 1 1 0

If the exponent is a negative number, the binary point moves to the left.

The mantissa can be given extra bits to accommodate the new position of the binary point.

Example:

Mantissa | Exponent
-1 1/2 1/4 1/8 1/16 1/32 1/64 | -4 2 1
0 0 0 1 0 1 1 | 1 1 0

1/8(0.125) + 1/32(0.03125) + 1/64(0.015625) = 0.171875

Floating Point - Negative Mantissa and Negative Exponent

In the example below the mantissa is -0.8125 and the exponent is -2.

When both the exponent and the mantissa are negative, the binary point moves to the left and any new binary digit added must be a 1.

Mantissa | Exponent
-1 1/2 1/4 1/8 1/16 | -4 2 1
1 0 0 1 1 | 1 1 0

The mantissa in the example below shows the new place digits as 1s.

The same principle applies to positive mantissas: if the mantissa starts with 0, the new place digits will be 0s.

Example:

Mantissa | Exponent
-1 1/2 1/4 1/8 1/16 1/32 1/64 | -4 2 1
1 1 1 0 0 1 1 | 1 1 0

-1 + 1/2 + 1/4 + 1/32 + 1/64 = -0.203125

Floating Point - Negative Mantissa and Positive Exponent

In this example the exponent is +4 and the mantissa is -0.875.

This means the binary point will need to move 4 places to the right.

Mantissa | Exponent
-1 1/2 1/4 1/8 | -8 4 2 1
1 0 0 1 | 0 1 0 0

The mantissa in the example below shows that 2 extra digits have been added to the right.

Digits to the right of the binary point are the fractional values.

Digits to the left of the point have the standard base 2 weightings.

Example:

Mantissa | Exponent
-16 8 4 2 1 . 1/2 | -8 4 2 1
1 0 0 1 0 . 0 | 0 1 0 0

-16 + 2 = -14
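Moving the binary point, including the rule that new places copy the sign bit, can be sketched in Python. This is an illustration under the assumption that the mantissa is a two's complement bit string with the point after its first bit; `move_point` and `point_value` are hypothetical names.

```python
def move_point(mantissa, exponent):
    """Return the mantissa as 'whole.fraction' after moving the binary point."""
    sign = mantissa[0]
    if exponent >= 0:
        # Point moves right: pad with 0s so at least one fraction digit remains.
        padded = mantissa.ljust(exponent + 2, "0")
        return padded[:exponent + 1] + "." + padded[exponent + 1:]
    # Point moves left: each new place is filled with a copy of the sign bit.
    return sign + "." + sign * (-exponent) + mantissa[1:]


def point_value(text):
    """Two's complement value of a 'whole.fraction' bit string."""
    whole, fraction = text.split(".")
    raw = int(whole + fraction, 2)
    if whole[0] == "1":
        raw -= 2 ** len(whole + fraction)   # leading bit has negative weight
    return raw / 2 ** len(fraction)


for mantissa, exponent in [("01011", 2), ("01011", -2), ("10011", -2), ("1001", 4)]:
    moved = move_point(mantissa, exponent)
    print(moved, point_value(moved))
# 010.11 2.75
# 0.001011 0.171875
# 1.110011 -0.203125
# 10010.0 -14.0
```

The four lines of output correspond to the four worked examples in this part of the lesson.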

Fixed vs Floating

Fixed and floating point each have their own advantages and disadvantages in terms of range, precision and the speed of calculation.

Floating point allows a far greater range of numbers using the same number of bits. Very large numbers and very small fractional numbers can be represented. The larger the mantissa, the greater the precision, and the larger the exponent, the greater the range.

Fixed-point numbers have a limited range, which is determined by the number of bits used to represent them. Fixed-point numbers have a fixed precision, which means that the same number of digits are always stored, regardless of the value of the number.

Fixed point binary is a simpler system and is faster to process compared to floating point.

Advantages of Fixed Point:

  • Faster calculations
  • Less hardware required
  • Easier to debug

Advantages of Floating Point:

  • Wide range
  • Variable precision
  • Can represent a wider variety of values

The best choice of representation will depend on the specific application. If speed and efficiency are critical, then fixed-point may be the better choice. If range or precision is critical, then floating-point may be the better choice.


Normalisation

There are two main reasons why we need to normalise floating point binary numbers:

  1. To ensure maximum accuracy. When a floating point number is normalised, the mantissa has no redundant leading digits (leading 0s in a positive number or leading 1s in a negative number). Every bit of the mantissa then holds a significant digit, which gives the greatest possible accuracy for the number of bits available.
  2. To ensure uniqueness. When a floating point number is normalised, each number has only one possible bit pattern to represent it. This is important for ensuring that floating point operations are performed correctly.

Normalisation is the process of moving the binary point of a floating point number to provide the maximum level of precision for a given number of bits.

Normalise a Positive Floating Point Number

Example: convert 14.125 into a normalised two's complement floating point number with a 10 bit mantissa and a 6 bit exponent.

-16 8 4 2 1 . 1/2 1/4 1/8 1/16 1/32
0 1 1 1 0 . 0 0 1 0 0

8 + 4 + 2 + ⅛(0.125) = 14.125

In a positive two's complement number the most significant bit must be 0. In order to maximise precision the next most significant bit must be 1.

Place the binary point between the first 0 and 1. This moves it 4 places to the left.

0 . 1 1 1 0 0 0 1 0 0

The value of the exponent is 4 because the point would have to be moved 4 places to the right to return to the original number.

The normalised two's complement floating point representation of 14.125 is below.

Mantissa | Exponent
-1 1/2 1/4 1/8 1/16 1/32 1/64 1/128 1/256 1/512 | -32 16 8 4 2 1
0 1 1 1 0 0 0 1 0 0 | 0 0 0 1 0 0

Normalise a Negative Floating Point Number

Example: convert -45.375 into a normalised two's complement floating point number with a 10 bit mantissa and a 6 bit exponent.

-64 32 16 8 4 2 1 . 1/2 1/4 1/8
1 0 1 0 0 1 0 . 1 0 1

-64 + 16 + 2 + ½(0.5) + ⅛(0.125) = -45.375

In a negative two's complement number the most significant bit must be 1.

In order to maximise precision the next most significant bit must be 0.

1 . 0 1 0 0 1 0 1 0 1

We must place the binary point between the first 1 and 0. In this case we need to move the binary point 6 places to the left, so the value of the exponent is 6.

The normalised two's complement floating point representation of -45.375 is below.

Mantissa | Exponent
-1 1/2 1/4 1/8 1/16 1/32 1/64 1/128 1/256 1/512 | -32 16 8 4 2 1
1 0 1 0 0 1 0 1 0 1 | 0 0 0 1 1 0

Normalise a Negative Floating Point Number - Negative Exponent

Example: convert -0.46875 into a normalised two's complement floating point number with a 10 bit mantissa and a 6 bit exponent.

-16 8 4 2 1 . 1/2 1/4 1/8 1/16 1/32
1 1 1 1 1 . 1 0 0 0 1

-16 + 8 + 4 + 2 + 1 + ½(0.5) + 1/32(0.03125) = -0.46875

Place the binary point between the last of the leading 1s and the first 0. In a negative number leading 1s can be dropped without changing the value, so in this case we need to move the binary point 1 place to the right and drop 5 leading 1s. The mantissa is then padded with 0s on the right to fill 10 bits.

1 . 0 0 0 1 0 0 0 0 0

The binary point was moved 1 place to the right so the value of the exponent is -1 because it would have to be moved 1 place to the left to return to the original number.

Final number is below:

Mantissa | Exponent
-1 1/2 1/4 1/8 1/16 1/32 1/64 1/128 1/256 1/512 | -32 16 8 4 2 1
1 0 0 0 1 0 0 0 0 0 | 1 1 1 1 1 1
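The three normalisation examples can be reproduced with a short Python sketch. It takes a different route from the hand method: rather than moving the point in a bit pattern, it halves or doubles the value until the mantissa falls in the normalised range (0.5 up to 1 for positive numbers, -1 up to -0.5 for negative numbers). The function names and default sizes are assumptions for this illustration.

```python
def to_twos(value, bits):
    """Two's complement bit string of an integer, bits wide."""
    return format(value % (2 ** bits), f"0{bits}b")


def normalise(value, mantissa_bits=10, exponent_bits=6):
    """Return (mantissa, exponent) bit strings for a non-zero value."""
    if value == 0:
        raise ValueError("zero has no normalised form")
    mantissa, exponent = value, 0
    # Too large: move the point left, so the exponent goes up.
    while mantissa >= 1 or mantissa < -1:
        mantissa /= 2
        exponent += 1
    # Too small: move the point right, so the exponent goes down.
    while -0.5 <= mantissa < 0.5:
        mantissa *= 2
        exponent -= 1
    # Scale to an integer; any digits beyond the mantissa size are truncated.
    scaled = int(mantissa * 2 ** (mantissa_bits - 1))
    return to_twos(scaled, mantissa_bits), to_twos(exponent, exponent_bits)


print(normalise(14.125))     # ('0111000100', '000100')
print(normalise(-45.375))    # ('1010010101', '000110')
print(normalise(-0.46875))   # ('1000100000', '111111')
```

In every result the first two mantissa bits are different, which is the test for a normalised number.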

Floating Point Arithmetic

1. These are the floating point numbers we are adding.

Mantissa | Exponent
-1 1/2 1/4 1/8 1/16 | -4 2 1
0 1 0 1 0 | 0 1 1
0 1 0 1 1 | 0 0 1

2. Put the binary point in the correct place, using the exponent. The first exponent is 3 and the second is 1.

0 1 0 1 . 0

0 1 . 0 1 1

3. Convert to fixed point. Line up the binary points. Add 0s in the extra places.

-8 4 2 1 . 1/2 1/4 1/8 1/16
0 1 0 1 . 0 0 0 0
0 0 0 1 . 0 1 1 0

4. Normal addition on both numbers.

1
0 1 0 1 0 0 0 0 5
+ 0 0 0 1 0 1 1 0 1.375
0 1 1 0 0 1 1 0 6.375

5. Normalise the binary number. Move the binary point 3 places to the left. This is a positive number, so the most significant bit must be 0 and the next bit must be 1. The exponent is +3 because the point would have to move 3 places to the right to return to the original value.

Mantissa | Exponent
-1 1/2 1/4 1/8 1/16 1/32 1/64 1/128 | -4 2 1
0 1 1 0 0 1 1 0 | 0 1 1
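The addition steps above can be sketched in Python by treating each fixed-point pattern as a whole number of sixteenths. This is a simplified illustration that only handles positive totals; the function names, the 4 fractional bits and the 8-bit width are assumptions taken from this worked example.

```python
def signed(pattern):
    """Two's complement value of a bit string."""
    value = int(pattern, 2)
    return value - 2 ** len(pattern) if pattern[0] == "1" else value


def to_fixed(mantissa, exponent, fraction_bits=4):
    """Value as a count of 1/2^fraction_bits steps (steps 2 and 3)."""
    shift = signed(exponent) - (len(mantissa) - 1) + fraction_bits
    m = signed(mantissa)
    return m << shift if shift >= 0 else m >> -shift   # a right shift may lose bits


def add_positive_floats(a, b, fraction_bits=4, width=8):
    """Add two floats; return (fixed-point sum, normalised mantissa, exponent)."""
    total = to_fixed(*a, fraction_bits) + to_fixed(*b, fraction_bits)   # step 4
    fixed = format(total, f"0{width}b")
    exponent = total.bit_length() - fraction_bits   # places the point moves left
    mantissa = "0" + format(total, "b")             # sign bit 0, then a leading 1
    return fixed, mantissa, exponent


print(add_positive_floats(("01010", "011"), ("01011", "001")))
# ('01100110', '01100110', 3)
```

The fixed-point sum 0110.0110 is 6.375, and the normalised mantissa 0.1100110 with exponent 3 represents the same value.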

Floating Point Subtraction

1. These are the floating point numbers we are subtracting.

Mantissa | Exponent
-1 1/2 1/4 1/8 1/16 | -4 2 1
0 1 1 0 0 | 0 1 0
0 1 0 1 0 | 0 1 1

2. Put the binary point in the correct place, using the exponent. The first exponent is 2 and the second is 3.

0 1 1 . 0 0

0 1 0 1 . 0

3. Convert to fixed point. Line up the binary points. Add 0s in the extra places.

-8 4 2 1 . 1/2 1/4 1/8 1/16
0 0 1 1 . 0 0 0 0
0 1 0 1 . 0 0 0 0

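4. Subtract by adding the negative of the second number. Flip all of its bits and add 1 in the least significant place. Here the digits to the right of the binary point are all 0, so they end up unchanged.

0 1 0 1 0 0 0 0 5
1 0 1 0 1 1 1 1 Flip
+ 0 0 0 0 0 0 0 1 Add 1
1 0 1 1 0 0 0 0 -5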

5. Add the positive and negative number.

1 1
0 0 1 1 0 0 0 0 3
+ 1 0 1 1 0 0 0 0 -5
1 1 1 0 0 0 0 0 -2

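6. Normalise the number. In a negative number the leading 1s can be dropped, leaving 1 0 . 0 0 0 0. Moving the binary point 1 place to the left puts it between the 1 and the 0, so the exponent is +1 (-1 x 2^1 = -2).

Mantissa | Exponent
-1 1/2 1/4 1/8 1/16 1/32 1/64 1/128 | -4 2 1
1 0 0 0 0 0 0 0 | 0 0 1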
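The subtraction steps can be sketched in Python on the fixed-point bit strings. The sketch assumes 8-bit patterns with 4 fractional bits, as in this example, and only normalises negative results; all function names are made up for the illustration.

```python
def negate_fixed(pattern):
    """Flip every bit and add 1 in the least significant place."""
    bits = len(pattern)
    flipped = int(pattern, 2) ^ (2 ** bits - 1)
    return format((flipped + 1) % 2 ** bits, f"0{bits}b")


def add_fixed(a, b):
    """Add two equal-width patterns and discard any carry out."""
    bits = len(a)
    return format((int(a, 2) + int(b, 2)) % 2 ** bits, f"0{bits}b")


def normalise_negative(pattern, fraction_bits=4):
    """Drop redundant leading 1s, then return (mantissa digits, exponent)."""
    stripped = pattern
    while stripped.startswith("11"):
        stripped = stripped[1:]
    # Places the point moves left to sit just after the sign bit.
    exponent = len(stripped) - 1 - fraction_bits
    return stripped, exponent


minus_five = negate_fixed("01010000")
result = add_fixed("00110000", minus_five)
print(minus_five)                   # 10110000
print(result)                       # 11100000
print(normalise_negative(result))   # ('100000', 1)
```

The mantissa digits 1.00000 represent -1, and -1 x 2^1 = -2, which matches 3 - 5. Padding the mantissa with 0s on the right fills it to the required number of bits.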

Underflow and Overflow

Overflow occurs when the result of a calculation is too large to be held in the number of bits allocated.

For example, adding two integers in an 8-bit byte (ignore the sign bit).

1 1 1
1 0 0 0 0 0 0 1 129
+ 1 0 0 0 0 0 1 1 131
1 0 0 0 0 0 1 0 0 260

Underflow occurs when a number is too small to be represented in the number of bits allocated.

It may occur if a very small number is divided by a number greater than 1.

Example: an 8-bit binary number is shifted one place to the right. The final 1 moves past the least significant bit and is lost.

1 0 0 0 0 0 0 1
÷2 0 1 0 0 0 0 0 0 1
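The lost bit can be detected before shifting. The Python sketch below is a simple illustration for unsigned integers; `halve` is a hypothetical name.

```python
def halve(value):
    """Shift right one place and report whether a 1 was shifted out."""
    lost = bool(value & 1)    # the least significant bit is about to be lost
    return value >> 1, lost


result, lost = halve(0b10000001)
print(format(result, "08b"), lost)   # 01000000 True
```

129 divided by 2 should be 64.5, but only 64 can be stored, so the half is lost.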
