If you need to count how many times each character appears in a file or phrase, there are some handy commands you can string together to do it, along with scripts and aliases that can make the job easy.

Determining how many characters are in a file is easy on the Linux command line: use the ls -l command, which reports the file size in bytes (the same as the character count for plain ASCII text). On the other hand, if you want a count of how many times each character appears in your file, you’re going to need a considerably more complicated command or a script. This post covers several different options.

Counting how many times each character appears in a file

To count how many times each character appears in a file, you need to string together a series of commands that splits the file into individual characters, sorts them, and then counts how many of each there are. To do that, you can use a command like this one:

    $ cat myfile | sed 's/\(.\)/\n\1/g' | sort | uniq -c | column
     24      58 c   112 i   132 o     7 T
    254       2 C     3 I     2 O    30 u
      1 '    50 d     4 j    29 p    23 v
     25 ,   163 e     5 k     1 P     9 w
     20 .     2 E    60 l     2 q     4 x
    142 a    21 f    48 m    90 r    36 y
      5 A    16 g     2 M     1 R     3 z
     23 b     1 G   117 n   147 s
      1 B    51 h     1 N   119 t

The sed command separates the file into single-character chunks by inserting a newline before every character (the \n in the replacement relies on GNU sed). That output is then sorted by the sort command. After that, each group of identical characters is counted by the uniq -c command, and the column command arranges the results in multiple columns. Since the results are based on the file content, no characters are listed besides those in the file.

Notice that the output lists the characters in sorted order thanks to the sort command. The first two entries show a count with no visible character because they represent linefeeds and spaces, which are only recognizable in context.

If you want to display the characters in frequency order instead, all you need to do is add a second sort command using the -g (general numeric) option.

    $ cat myfile | sed 's/\(.\)/\n\1/g' | sort | uniq -c | sort -g | column
      1 '     2 O     9 w    30 u   117 n
      1 B     2 q    16 g    36 y   119 t
      1 G     3 I    20 .    48 m   132 o
      1 N     3 z    21 f    50 d   142 a
      1 P     4 j    23 b    51 h   147 s
      1 R     4 x    23 v    58 c   163 e
      2 C     5 A    24      60 l   254
      2 E     5 k    25 ,    90 r
      2 M     7 T    29 p   112 i

To reverse the listing to show the most frequently used characters first, add the r (reverse) option to that last sort command.

    $ cat myfile | sed 's/\(.\)/\n\1/g' | sort | uniq -c | sort -gr | column
    254      60 l    24       5 A     2 C
    163 e    58 c    23 v     4 x     1 R
    147 s    51 h    23 b     4 j     1 P
    142 a    50 d    21 f     3 z     1 N
    132 o    48 m    20 .     3 I     1 G
    119 t    36 y    16 g     2 q     1 B
    117 n    30 u     9 w     2 O     1 '
    112 i    29 p     7 T     2 M
     90 r    25 ,     5 k     2 E

The character at the top of the list is, as I assume you guessed, the space character. The second most often used character in the file is an “e”. No surprise there either. In addition, capital letters fall near the end of the list since they are not frequently used.

Note that if you don’t want to distinguish between uppercase and lowercase letters, you can insert a tr (translate) command into the command string like this:

    $ cat myfile | sed 's/\(.\)/\n\1/g' | tr '[:upper:]' '[:lower:]' | sort | uniq -c | sort -gr | column
    254     115 i    36 y    21 f     3 z
    165 e    91 r    30 u    20 .     2 q
    147 s    60 l    30 p    17 g     1 '
    147 a    60 c    25 ,     9 w
    134 o    51 h    24 b     5 k
    126 t    50 m    24       4 x
    118 n    50 d    23 v     4 j

Switch the positions of the “upper” and “lower” arguments to display the results all in uppercase.

Counting character-by-character in a word or phrase

You can also use a command similar to those shown above to count how many times each letter appears in a single word or phrase. Here’s an example:

    $ echo "Hello, World!" | sed 's/\(.\)/\n\1/g' | sort | uniq -c | sort -gr | column
      3 l     1 r     1 d     1
      2 o     1 H     1 ,     1
      1 W     1 e     1 !

Using an alias

While the commands shown above are clever, they’re not easy to remember or type. Creating an alias can help with this. Once you decide what form of output you prefer, turn the command into an alias like this:

    $ alias CountChars="sed 's/\(.\)/\n\1/g' | sort | uniq -c | sort -gr | column"

Save the alias in your .bashrc file so that you can use it as needed.
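Because an alias cannot take arguments, a shell function is one possible alternative when you want to pass file names directly. The sketch below is not one of the commands shown above: the function name count_chars is hypothetical, and it swaps sed for fold -w 1 to split the input, which assumes plain ASCII text and means the blank-line count can differ from the sed version. Like the alias, it could be saved in your .bashrc file.

```shell
# Hypothetical function (the name count_chars is an assumption, not from the post).
# Counts how often each character appears, most frequent first.
count_chars() {
    # Read the named files, or standard input when no file is given.
    cat -- "$@" |
        fold -w 1 |   # one character per line (plain ASCII input assumed)
        sort |        # group identical characters together
        uniq -c |     # prefix each distinct character with its count
        sort -rn      # order by count, highest first
}
```

It could be called as count_chars myfile or placed at the end of a pipe, and its output can be piped to column for the multi-column layout. The CountChars alias is used at the end of a pipe in the same way.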
Then use it in commands like these:

    $ cat myfile | CountChars
    254      60 l    24       5 A     2 C
    163 e    58 c    23 v     4 x     1 R
    147 s    51 h    23 b     4 j     1 P
    142 a    50 d    21 f     3 z     1 N
    132 o    48 m    20 .     3 I     1 G
    119 t    36 y    16 g     2 q     1 B
    117 n    30 u     9 w     2 O     1 '
    112 i    29 p     7 T     2 M
     90 r    25 ,     5 k     2 E
    $ echo "Hello, World!" | CountChars
      3 l     1 r     1 d     1
      2 o     1 H     1 ,     1
      1 W     1 e     1 !

Using a script

If you want to see only alphabetic characters, you can use a script like the one shown below. It first changes all the letters to lowercase. It then runs through the alphabet, uses awk to count the number of times each letter appears, and displays only the counts that are greater than zero. It only works with whatever string is provided as an argument.

    #!/bin/bash

    # make argument all lowercase
    string=$(echo "$1" | tr '[:upper:]' '[:lower:]')

    for char in {a..z}
    do
        # split the string on the letter; the number of fields minus 1 is the count
        count=$(awk -F"${char}" '{print NF-1}' <<< "$string")
        # rest of the loop reconstructed from the description above:
        # display the count only when the letter appears
        if [ "$count" -gt 0 ]; then
            echo "$char:$count"
        fi
    done

Run it like this:

    $ CountByChar "Hello, World!"
    d:1
    e:1
    h:1
    l:3
    o:2
    r:1
    w:1

Note that characters will always be listed in alphabetical order. You can pipe the output to the column command if you want fewer lines of output.

    $ CountByChar "Hello, World!" | column
    d:1     e:1     h:1     l:3     o:2     r:1     w:1

Wrap-up

Whether you’re looking for character counts in files or phrases, there are some handy options available. Turning the complex ones into aliases is probably the best way to make the task easy.
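The letter-counting script described above accepts only a string argument. As a rough sketch of how the same idea might be applied to a file or to piped input, the function below makes a single awk pass over the text instead of running awk once per letter. The name count_letters is hypothetical, and the approach is an assumption rather than part of the original script.

```shell
# Hypothetical helper (the name count_letters is an assumption).
# Prints letter:count for each letter a-z that appears, in alphabetical order.
count_letters() {
    # Read the named files, or standard input when no file is given.
    cat -- "$@" |
        tr '[:upper:]' '[:lower:]' |   # fold case, as the script does
        tr -cd 'a-z\n' |               # keep only letters and newlines
        awk '
            {
                # Walk each line one character at a time and tally the letters.
                for (i = 1; i <= length($0); i++)
                    count[substr($0, i, 1)]++
            }
            END {
                # Step through the alphabet so output is alphabetical,
                # skipping letters that never appeared.
                alphabet = "abcdefghijklmnopqrstuvwxyz"
                for (i = 1; i <= length(alphabet); i++) {
                    c = substr(alphabet, i, 1)
                    if (c in count)
                        print c ":" count[c]
                }
            }'
}
```

Running echo "Hello, World!" | count_letters should list d:1, e:1, h:1, l:3, o:2, r:1 and w:1, one per line, matching the output of the script for the same phrase.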