sandra_henrystocker
Unix Dweeb

Squinting at ASCII on Linux

How-To
Dec 22, 2017 (6 mins)
Linux, Ubuntu

ASCII plays a much more important role on our systems than generating techno-art. Let's explore the commands that allow you to see how it works.

Back when I started working with computers, understanding the nature of ASCII was exciting. In fact, just knowing how to convert binary to hex was fun.

That was a lot of years ago, before ASCII had even reached drinking age, but character encoding standards are as important as ever today, with the internet being so much a part of our business and our personal lives. They’re also more complex and more numerous than you might imagine. So, let’s dive into some of the details of what ASCII is and some of the commands that make it easier to see coding standards in action.

Why ASCII?

ASCII came about to circumvent the problem that different types of electronic systems were storing text in different ways. They all used some form of ones and zeroes (or ONs and OFFs), but the issue of compatibility became important when they needed to interact. So, ASCII was developed primarily to provide encoding consistency. Work on the standard began in 1960, and it was first published as a U.S. standard in 1963. Initially, ASCII characters used only 7 bits. Some years later, a variety of 8-bit extensions (often loosely called "extended ASCII") put all 8 bits in each byte to use.

That said, it is important to understand that ASCII, the American Standard Code for Information Interchange, is not used on all computers. In fact, most Linux systems today use UTF-8, a standard that is closely related to ASCII but not identical. In UTF-8, the classic ASCII characters are encoded in a single byte with the high bit set to zero (so only 7 bits carry the value), and characters with greater values use two to four bytes.
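To make the byte counts concrete, here is a minimal Python sketch (Python 3.8 or later is assumed for the `hex(" ")` separator). The helper name `utf8_summary` and the sample characters are my own choices for illustration, not anything taken from a Linux command.

```python
# Show how many bytes UTF-8 needs for characters of increasing code point value.
# The samples are written as escapes so the script itself stays pure ASCII.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]  # A, e-acute, euro sign, an emoji


def utf8_summary(ch):
    """Return (code point, number of UTF-8 bytes, bytes as spaced hex)."""
    encoded = ch.encode("utf-8")
    return ord(ch), len(encoded), encoded.hex(" ")


for ch in samples:
    code_point, length, hex_bytes = utf8_summary(ch)
    print(f"U+{code_point:04X}  {length} byte(s)  {hex_bytes}")
```

The plain "A" comes back as the single byte 41, exactly as it would in ASCII, while the other three characters need two, three and four bytes respectively.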

Some of the more important encoding standards in use today include:

  • ASCII — Most widely used for English before 2000
  • UTF-8 — Used in Linux by default along with much of the internet
  • UTF-16 — Used by Microsoft Windows, Mac OS X file systems and others
  • GB 18030 — Used in China (contains all Unicode chars)
  • EUC-JP (Extended Unix Code) — Used in Japan
  • ISO/IEC 8859 series — Used for most European languages

According to one source that I describe below, however, there are as many as 1,173 different encoding schemes in use today.
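One way to get a feel for how the encodings listed above differ is to encode the same characters under each of them. The sketch below uses Python's built-in codecs; the codec names (`ascii`, `utf-8`, `utf-16-le`, `gb18030`, `euc_jp`, `iso8859_1`) are Python's spellings, and `encode_with` is a hypothetical helper written just for this illustration.

```python
# Encode the same text under several encodings and compare the resulting bytes.
ENCODINGS = ["ascii", "utf-8", "utf-16-le", "gb18030", "euc_jp", "iso8859_1"]


def encode_with(text, encoding):
    """Return the encoded bytes, or None if the encoding cannot represent text."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


for text in ["A", "\u00e9"]:  # plain "A" and e-acute
    print(f"U+{ord(text):04X}")
    for encoding in ENCODINGS:
        result = encode_with(text, encoding)
        shown = result.hex(" ") if result is not None else "not representable"
        print(f"  {encoding:<10} {shown}")
```

The letter "A" is the byte 41 in every one of these except UTF-16, which pads it out to two bytes. The accented character is where the encodings part ways, and plain ASCII cannot represent it at all.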

Viewing an ASCII translation table

One of the easiest ways to display an ASCII table on Linux systems is to use the man ASCII or man ascii command. Within the body of the page displayed, you will see a table that starts like this:

       Oct   Dec   Hex   Char                        Oct   Dec   Hex   Char
       ────────────────────────────────────────────────────────────────────────
       000   0     00    NUL '
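If the man page isn't handy, a similar table can be generated with a few lines of Python. This is only a rough sketch: it prints a single column rather than the two side-by-side columns of the man page, the `ascii_row` helper is a name I made up, and control characters are shown with a generic "(ctrl)" label instead of their proper names (NUL, SOH and so on).

```python
# Print a single-column ASCII table in the Oct / Dec / Hex / Char layout.
def ascii_row(code):
    """Format one table row for an ASCII code in the range 0-127."""
    ch = chr(code)
    # Control characters (0-31 and 127) have no visible glyph.
    shown = ch if ch.isprintable() else "(ctrl)"
    return f"{code:03o}   {code:<3d}   {code:02X}    {shown}"


print("Oct   Dec   Hex   Char")
for code in range(128):
    print(ascii_row(code))
```

Each row gives the same value three ways, so it is easy to confirm, for example, that "A" is octal 101, decimal 65 and hex 41.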

Sandra Henry-Stocker has been administering Unix systems for more than 30 years. She describes herself as "USL" (Unix as a second language) but remembers enough English to write books and buy groceries. She lives in the mountains in Virginia where, when not working with or writing about Unix, she's chasing the bears away from her bird feeders.

The opinions expressed in this blog are those of Sandra Henry-Stocker and do not necessarily represent those of IDG Communications, Inc., its parent, subsidiary or affiliated companies.