Lesson Objectives
- Understand the purpose of ASCII and Unicode.
- Be able to convert text into binary.
- Be able to explain the term "character set".
- Be able to calculate the file size of text.
ASCII (pronounced "az-kee" or "ass-key" if American) stands for the American Standard Code for Information Interchange. It serves as a character encoding standard used for electronic communication between computers, telecommunications equipment, and other devices. Here are some key points about ASCII:
Despite being an American standard, ASCII does not include a code point for the cent symbol (¢) or support English terms with diacritical marks (such as résumé and jalapeño) or proper nouns with diacritical marks (such as Beyoncé).
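As a rough illustration of this limitation, the Python sketch below tries to encode a few words as ASCII. The helper name `is_ascii` is made up for this example; it relies only on Python's built-in `str.encode`, which raises `UnicodeEncodeError` when a character has no ASCII code.

```python
def is_ascii(text: str) -> bool:
    """Return True if every character in text has an ASCII code (0-127)."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        # At least one character falls outside the 128 ASCII code points.
        return False


for word in ["resume", "résumé", "jalapeño", "Beyoncé", "5¢"]:
    print(word, is_ascii(word))
```

Only the unaccented spelling "resume" should pass the check; the other words contain characters that ASCII cannot represent.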
Binary | Dec | Hex | Char | Binary | Dec | Hex | Char | Binary | Dec | Hex | Char |
---|---|---|---|---|---|---|---|---|---|---|---|
0100000 | 32 | 20 | (space) | 1000000 | 64 | 40 | @ | 1100000 | 96 | 60 | ` |
0100001 | 33 | 21 | ! | 1000001 | 65 | 41 | A | 1100001 | 97 | 61 | a |
0100010 | 34 | 22 | " | 1000010 | 66 | 42 | B | 1100010 | 98 | 62 | b |
0100011 | 35 | 23 | # | 1000011 | 67 | 43 | C | 1100011 | 99 | 63 | c |
0100100 | 36 | 24 | $ | 1000100 | 68 | 44 | D | 1100100 | 100 | 64 | d |
0100101 | 37 | 25 | % | 1000101 | 69 | 45 | E | 1100101 | 101 | 65 | e |
0100110 | 38 | 26 | & | 1000110 | 70 | 46 | F | 1100110 | 102 | 66 | f |
0100111 | 39 | 27 | ' | 1000111 | 71 | 47 | G | 1100111 | 103 | 67 | g |
0101000 | 40 | 28 | ( | 1001000 | 72 | 48 | H | 1101000 | 104 | 68 | h |
0101001 | 41 | 29 | ) | 1001001 | 73 | 49 | I | 1101001 | 105 | 69 | i |
0101010 | 42 | 2A | * | 1001010 | 74 | 4A | J | 1101010 | 106 | 6A | j |
0101011 | 43 | 2B | + | 1001011 | 75 | 4B | K | 1101011 | 107 | 6B | k |
0101100 | 44 | 2C | , | 1001100 | 76 | 4C | L | 1101100 | 108 | 6C | l |
0101101 | 45 | 2D | - | 1001101 | 77 | 4D | M | 1101101 | 109 | 6D | m |
0101110 | 46 | 2E | . | 1001110 | 78 | 4E | N | 1101110 | 110 | 6E | n |
0101111 | 47 | 2F | / | 1001111 | 79 | 4F | O | 1101111 | 111 | 6F | o |
0110000 | 48 | 30 | 0 | 1010000 | 80 | 50 | P | 1110000 | 112 | 70 | p |
0110001 | 49 | 31 | 1 | 1010001 | 81 | 51 | Q | 1110001 | 113 | 71 | q |
0110010 | 50 | 32 | 2 | 1010010 | 82 | 52 | R | 1110010 | 114 | 72 | r |
0110011 | 51 | 33 | 3 | 1010011 | 83 | 53 | S | 1110011 | 115 | 73 | s |
0110100 | 52 | 34 | 4 | 1010100 | 84 | 54 | T | 1110100 | 116 | 74 | t |
0110101 | 53 | 35 | 5 | 1010101 | 85 | 55 | U | 1110101 | 117 | 75 | u |
0110110 | 54 | 36 | 6 | 1010110 | 86 | 56 | V | 1110110 | 118 | 76 | v |
0110111 | 55 | 37 | 7 | 1010111 | 87 | 57 | W | 1110111 | 119 | 77 | w |
0111000 | 56 | 38 | 8 | 1011000 | 88 | 58 | X | 1111000 | 120 | 78 | x |
0111001 | 57 | 39 | 9 | 1011001 | 89 | 59 | Y | 1111001 | 121 | 79 | y |
0111010 | 58 | 3A | : | 1011010 | 90 | 5A | Z | 1111010 | 122 | 7A | z |
0111011 | 59 | 3B | ; | 1011011 | 91 | 5B | [ | 1111011 | 123 | 7B | { |
0111100 | 60 | 3C | < | 1011100 | 92 | 5C | \ | 1111100 | 124 | 7C | \| |
0111101 | 61 | 3D | = | 1011101 | 93 | 5D | ] | 1111101 | 125 | 7D | } |
0111110 | 62 | 3E | > | 1011110 | 94 | 5E | ^ | 1111110 | 126 | 7E | ~ |
0111111 | 63 | 3F | ? | 1011111 | 95 | 5F | _ | 1111111 | 127 | 7F | DEL |
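To convert text into binary, look up each character's decimal code and write that number in binary, padded to 7 bits (or 8 bits for 8-bit ASCII). The Python sketch below is one possible way to do this; the function name `text_to_binary` and its `bits` parameter are invented for this example, while `ord` and `format` are Python built-ins.

```python
def text_to_binary(text: str, bits: int = 7) -> str:
    """Convert ASCII text to a space-separated string of binary codes."""
    codes = []
    for ch in text:
        code = ord(ch)  # decimal code of the character, e.g. 'A' -> 65
        if code > 127:
            raise ValueError(f"{ch!r} is not an ASCII character")
        # Write the code in binary, padded with leading zeros to `bits` digits.
        codes.append(format(code, f"0{bits}b"))
    return " ".join(codes)


print(text_to_binary("Hi!"))     # 1001000 1101001 0100001
print(text_to_binary("Hi!", 8))  # 01001000 01101001 00100001
```

Checking by hand: H is 72, i is 105 and ! is 33 in decimal, which gives the three binary codes printed above.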
8-bit ASCII, often called Extended ASCII, builds on the original ASCII standard by using 8 bits for each character instead of 7.
Standard ASCII uses 7 bits, which gives 2^7 = 128 code points. Using 8 bits gives 2^8 = 256 code points, so 128 extra characters can be added.
The extra codes (128 to 255) are used for special symbols, accented letters, and other language-specific characters. There is no single Extended ASCII standard: different 8-bit character sets, such as ISO 8859-1 (Latin-1) and Windows-1252, assign different characters to these codes.
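The arithmetic behind these numbers, and the file size calculation from the lesson objectives, can be sketched in Python as follows. The function names `code_points` and `text_size_bits` are made up for this example, and the calculation assumes every character is stored with the same fixed number of bits and ignores any file metadata.

```python
def code_points(bits: int) -> int:
    """Number of different characters that can be represented with `bits` bits."""
    return 2 ** bits


def text_size_bits(text: str, bits_per_char: int = 8) -> int:
    """File size in bits = number of characters x bits per character."""
    return len(text) * bits_per_char


print(code_points(7), code_points(8))  # 128 256

message = "Hello World"  # 11 characters, including the space
size_bits = text_size_bits(message)
print(size_bits, "bits =", size_bits // 8, "bytes")  # 88 bits = 11 bytes
```

Remember that spaces and punctuation count as characters when working out the size of a piece of text.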
A Spooky Ghost _,.--. .' `-. / O O \ | / | / | / \ / `.__.' |
A Cat /\_/\ ( o.o ) > ^ < / --- \ / \ / \ |
???? /\ / \ / o o \ / ^ \ / \ /_/-\___\_\ |
An Apple ,--./,-. / # \ | | \ / `._,._,' |
Unicode, formally known as The Unicode Standard, is a text encoding standard maintained by the Unicode Consortium. Its purpose is to support the use of text written in all of the world's major writing systems.
Unicode assigns a unique number to every character, regardless of the platform, program, or language. Before Unicode, various character encodings existed, each with limitations. These early encoding methods could not cover all languages and often conflicted with one another. Unicode changed this by providing a consistent way to represent characters across different languages.
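Unicode was originally designed as a 16-bit code, but 16 bits (65,536 values) proved too few. It now defines code points from U+0000 to U+10FFFF, and each code point is stored using an encoding such as UTF-8 (1 to 4 bytes per character), UTF-16 (2 or 4 bytes), or UTF-32 (4 bytes).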
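How many bytes a character actually occupies depends on the encoding used to store it. The Python sketch below prints the Unicode code point of a few characters, together with the number of bytes each one needs in UTF-8 and UTF-16. The helper name `describe` is invented for this example; `"utf-16-be"` is used so that the byte count does not include a byte order mark.

```python
def describe(ch: str) -> tuple[str, int, int]:
    """Return (code point, bytes in UTF-8, bytes in UTF-16) for one character."""
    code_point = f"U+{ord(ch):04X}"           # e.g. 'A' -> 'U+0041'
    utf8_len = len(ch.encode("utf-8"))        # 1 to 4 bytes
    utf16_len = len(ch.encode("utf-16-be"))   # 2 or 4 bytes
    return code_point, utf8_len, utf16_len


for ch in ["A", "é", "€", "😀"]:
    print(ch, describe(ch))
```

The letter A keeps its ASCII code (65, or U+0041) and needs a single byte in UTF-8, while the emoji lies beyond U+FFFF and needs 4 bytes in both encodings.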
Here are some example characters in the Unicode character set: