ASCII: Code Chart, Control Codes & End-of-Line Conventions
This page explains the ASCII standard — how it encodes characters as numbers, the role of control codes, and how text structure (like newlines) is represented.
What is ASCII
ASCII (American Standard Code for Information Interchange) was developed in the early 1960s to unify how computers represent text. Before ASCII, each manufacturer used its own incompatible encoding.
ASCII defines a mapping between numbers (0–127) and symbols:
- Letters A–Z and a–z
- Digits 0–9
- Punctuation
- Control characters (for formatting and transmission)
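| Character | H | e | l | l | o | , | (space) | w | o | r | l | d | ! |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Code | 72 | 101 | 108 | 108 | 111 | 44 | 32 | 119 | 111 | 114 | 108 | 100 | 33 |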
Each character corresponds to a 7-bit value, so it fits in a single 8-bit byte with the highest bit unused (normally 0).
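The mapping above can be reproduced in a few lines of Python. This is a minimal sketch; the helper names `to_codes` and `from_codes` are illustrative and not part of any standard.

```python
def to_codes(text: str) -> list[int]:
    """Return the ASCII code of each character; reject non-ASCII input."""
    codes = [ord(ch) for ch in text]
    if any(code > 127 for code in codes):
        raise ValueError("input contains non-ASCII characters")
    return codes


def from_codes(codes: list[int]) -> str:
    """Inverse of to_codes: turn a list of ASCII codes back into text."""
    return "".join(chr(code) for code in codes)


codes = to_codes("Hello, world!")
print(codes)  # [72, 101, 108, 108, 111, 44, 32, 119, 111, 114, 108, 100, 33]

# Every code fits in 7 bits, e.g. 'H' (72) and 'e' (101):
print([format(code, "07b") for code in codes[:2]])  # ['1001000', '1100101']
```

`ord` and `chr` operate on Unicode code points in general; the explicit range check is what restricts this sketch to ASCII.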
Structure of the ASCII table
| Range | Type | Examples |
|---|---|---|
| 0–31 | Control codes | NUL, LF, CR, TAB |
| 32–47 | Punctuation | Space, !"#$%&'()*+,-./ |
| 48–57 | Digits | 0–9 |
| 58–64 | Punctuation | :;<=>?@ |
| 65–90 | Uppercase letters | A–Z |
| 91–96 | Symbols | [\]^_` |
| 97–122 | Lowercase letters | a–z |
| 123–126 | Symbols | {\|}~ |
| 127 | Control | DEL |
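Because the table is organized in contiguous ranges, classifying a code takes only a few comparisons. The sketch below mirrors the ranges in the table; the function name `classify` and its category labels are assumptions made for illustration.

```python
def classify(code: int) -> str:
    """Return the table category for an ASCII code (0-127)."""
    if not 0 <= code <= 127:
        raise ValueError("not an ASCII code")
    if code <= 31 or code == 127:
        return "control"
    if 48 <= code <= 57:
        return "digit"
    if 65 <= code <= 90:
        return "uppercase"
    if 97 <= code <= 122:
        return "lowercase"
    # Everything left is the space, punctuation, or a symbol.
    return "punctuation/symbol"


print(classify(10))        # control
print(classify(ord("A")))  # uppercase
print(classify(ord("~")))  # punctuation/symbol

# Two consequences of the layout:
print(ord("7") - ord("0"))     # 7: a digit's value is its code minus 48
print(chr(ord("A") | 0x20))    # a: upper and lower case differ by 32 (one bit)
```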
Control characters
ASCII uses the range 0–31 and 127 for non-printable control codes that manage text flow and communication. Common ones still relevant today:
| Code | Abbrev | Meaning | Typical use |
|---|---|---|---|
| 8 | BS | Backspace | Move cursor one position left |
| 9 | HT | Horizontal Tab | Indentation or alignment |
| 10 | LF | Line Feed | Move to next line (Unix newline) |
| 13 | CR | Carriage Return | Move to line start (Mac classic) |
| 3 | ETX | End of Text | Interrupt signal (Ctrl-C) |
| 4 | EOT | End of Transmission | End of input (Ctrl-D on Unix) |
| 26 | SUB | Substitute | End of input on Windows (Ctrl-Z) |
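Control codes are invisible when printed, which makes them awkward to debug. The following sketch replaces the codes from the table with visible tags, and also illustrates the conventional relationship between Ctrl key combinations and control codes: the terminal keeps only the low five bits of the letter's code. The names `CONTROL_NAMES`, `show_controls`, and `ctrl_code` are hypothetical helpers, not a library API.

```python
# Abbreviations for the control codes listed in the table above.
CONTROL_NAMES = {3: "ETX", 4: "EOT", 8: "BS", 9: "HT", 10: "LF", 13: "CR", 26: "SUB"}


def show_controls(text: str) -> str:
    """Replace the listed control characters with visible <NAME> tags."""
    out = []
    for ch in text:
        name = CONTROL_NAMES.get(ord(ch))
        out.append(f"<{name}>" if name else ch)
    return "".join(out)


def ctrl_code(letter: str) -> int:
    """Code conventionally sent by Ctrl plus a letter: the low five bits."""
    return ord(letter.upper()) & 0x1F


print(show_controls("a\tb\r\n"))                        # a<HT>b<CR><LF>
print(ctrl_code("C"), ctrl_code("D"), ctrl_code("Z"))   # 3 4 26
```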
End-of-line conventions
Different systems use different characters to indicate line endings:
- Unix/Linux/macOS: LF (Line Feed, code 10)
- Classic Mac OS (version 9 and earlier): CR (Carriage Return, code 13)
- Windows: CR LF (Carriage Return + Line Feed, codes 13 and 10)
Example:
Unix: Hello\nWorld
Windows: Hello\r\nWorld
Mac OS 9: Hello\rWorld
Many modern editors and libraries handle all formats transparently, but mixed line endings can cause parsing errors.
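When mixed line endings are suspected, it helps to count each convention in the raw bytes and then normalize to a single one. The sketch below is one possible approach, with illustrative function names; it works on `bytes` so that no newline translation happens before the data is inspected.

```python
def count_line_endings(data: bytes) -> dict[str, int]:
    """Count CRLF pairs, lone LFs, and lone CRs in raw file contents."""
    crlf = data.count(b"\r\n")
    return {
        "CRLF": crlf,
        "LF": data.count(b"\n") - crlf,  # LFs that are not part of a CRLF
        "CR": data.count(b"\r") - crlf,  # CRs that are not part of a CRLF
    }


def normalize_to_lf(data: bytes) -> bytes:
    """Convert every line ending to LF."""
    # CRLF must be replaced first; otherwise each CRLF would become two LFs.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


mixed = b"Hello\r\nWorld\nfrom\rASCII"
print(count_line_endings(mixed))  # {'CRLF': 1, 'LF': 1, 'CR': 1}
print(normalize_to_lf(mixed))     # b'Hello\nWorld\nfrom\nASCII'
```

To read a real file this way, open it in binary mode (`open(path, "rb")`), since Python's text mode translates line endings by default.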
Historical background: “Carriage Return”
The term originates from typewriters:
- The **carriage** held the paper and moved horizontally while typing.
- Returning the carriage (CR) moved it back to the left margin.
- Feeding the line (LF) rolled the paper up one line.
Teletypes, the forerunners of early terminals, used the same terminology, and it carried over into ASCII and computing.
Limitations of ASCII
- Only 128 symbols — suitable for English, but insufficient for accented or non-Latin scripts.
- Cannot represent characters like ä, é, ñ, ß, α, or й.
- Led to the creation of **extended ASCII** encodings (ISO-8859 series, CP1252, etc.), where values 128–255 were used for regional symbols.
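Both limitations are easy to observe in Python: encoding a non-English character as ASCII fails, and a single byte above 127 decodes to different characters depending on which extended encoding is assumed. This is a small sketch; the helper names are made up for the example, and the codec names are the ones Python's standard library ships.

```python
def is_ascii_encodable(text: str) -> bool:
    """Return True if every character of text exists in ASCII."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError:
        return False
    return True


def decode_byte(value: int, encoding: str) -> str:
    """Interpret one byte value under the given single-byte encoding."""
    return bytes([value]).decode(encoding)


print(is_ascii_encodable("Hello"))  # True
print(is_ascii_encodable("ñ"))      # False

# The same byte, three different characters (Latin, Cyrillic, Greek tables):
for encoding in ("latin-1", "iso-8859-5", "iso-8859-7"):
    print(encoding, decode_byte(0xE9, encoding))
```

Under `latin-1` the byte 0xE9 is `é`; the other two code pages assign that value to a Cyrillic and a Greek letter respectively, which is why the encoding of a file must be known before bytes above 127 can be interpreted.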
Why ASCII still matters
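- ASCII forms the foundation of modern text encoding: the first 128 Unicode code points are exactly the ASCII characters, and most widely used encodings (UTF-8, the ISO-8859 series, CP1252) are ASCII-compatible. A few, such as EBCDIC and UTF-16, are not byte-compatible with ASCII.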
- ASCII characters (0–127) have the same numeric values in UTF-8.
- Most programming languages and file formats assume UTF-8 or ASCII-compatible encodings.
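The compatibility with UTF-8 can be checked directly. In the sketch below, `is_pure_ascii` is a hypothetical helper that relies on one property of UTF-8: every byte of a multi-byte sequence has its highest bit set, so bytes below 128 always stand for ASCII characters.

```python
def is_pure_ascii(data: bytes) -> bool:
    """Return True if data contains only byte values 0-127."""
    return all(byte < 0x80 for byte in data)


text = "Hello"
print(text.encode("ascii") == text.encode("utf-8"))  # True: identical bytes

accented = "é".encode("utf-8")
print(accented)                 # b'\xc3\xa9': two bytes, both above 127
print(is_pure_ascii(accented))  # False
print(is_pure_ascii(b"plain text\n"))  # True
```

Python's `bytes.isascii()` (available since Python 3.7) performs the same check; the explicit loop is shown only to make the rule visible.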
Summary
- ASCII maps 128 characters to numbers 0–127.
- Control codes (0–31, 127) manage structure and transmission.
- Text files differ in line endings across systems.
- ASCII remains the core subset of Unicode and UTF-8.
- Understanding ASCII is essential for debugging, protocol analysis, and encoding conversions.