sandra_henrystocker
Unix Dweeb

The many faces of the Linux grep command

How-To
Feb 21, 2017
5 mins
Data Center, Linux

The everlastingly useful grep command can change its character with the flip of a switch to help you find things.

The grep command – likely one of the first ten commands that every Unix user comes to know and love – is not just a nice tool for finding a word or phrase in a file or command output. It can take on some vastly different personalities that allow you to more cleverly find the data that you are looking for and has more flexibility than many of its users have discovered.

Historically provided as separate binaries, the different “flavors” of grep are now provided through a number of key command options that change how grep interprets the pattern that you provide for your search. To easily switch from one mode of searching to another, the different grep commands could be set up as aliases such as these:

alias egrep='grep -E'
alias fgrep='grep -F'
alias plgrep='grep -P'
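In bash, aliases are not expanded in non-interactive shells unless expand_aliases is set, so inside scripts a shell function is a common substitute. The following is a minimal sketch, assuming a POSIX-style shell and a grep that supports -E and -F. The function names efind and ffind are hypothetical, chosen so that they do not shadow any existing egrep or fgrep binaries, and the sample lines are illustrative rather than the full colors file.

```shell
# Hypothetical wrapper functions; unlike bash aliases, these also work in scripts.
efind() { grep -E "$@"; }   # extended regular expressions
ffind() { grep -F "$@"; }   # fixed (literal) strings

# Inline sample data for illustration (not the article's colors file).
e_out=$(printf '32 = green\n33 = orange\n35 = purple\n' | efind 'green|purple')
f_out=$(printf '32 = green\n33 = orange\n35 = purple\n' | ffind 'orange')
printf '%s\n' "$e_out" "$f_out"
```

Because the functions pass "$@" through unchanged, any additional grep options and file names can be given after the function name just as with the aliases.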

egrep

If you’ve used egrep and fgrep in the past, you’ll find that grep -E and grep -F work as you’d expect. They’re just built into a single executable on most systems today. So, you can use the options or set up the aliases to make using them a bit easier.

With the -E switch, grep uses extended regular expressions. This means, for example, that you can separate alternative patterns with a | (vertical bar) and match lines that contain any one of them, as shown in the example below.

$ egrep "green|yellow|purple" colors
32 = green
35 = purple
42 = green background
45 = purple background
92 = light green
93 = yellow
95 = light purple
102 = light green background
103 = yellow background
105 = light purple background
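To see what the -E switch changes, the sketch below runs one pattern with and without it on a few inline sample lines (illustrative data, not the full colors file). Without -E, grep uses basic regular expressions, in which an unescaped | is an ordinary character, so no line matches.

```shell
data='32 = green
33 = orange
93 = yellow'

# Extended syntax: | separates alternatives.
with_e=$(printf '%s\n' "$data" | grep -E -c 'green|yellow')

# Basic syntax: | is a literal character, so no line matches.
# grep -c still prints 0; "|| true" only masks grep's non-zero exit status.
without_e=$(printf '%s\n' "$data" | grep -c 'green|yellow' || true)

echo "with -E: $with_e, without -E: $without_e"
```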

fgrep

With -F, grep interprets the patterns you provide as fixed strings. This means that it doesn’t interpret any expressions that you specify, but takes them literally. The $ in the command below, for example, is not taken as indicating that some kind of interpretation is needed. Because of this literalism, fgrep (i.e., grep -F) commands tend to run a little faster than other grep commands.

$ cat txt.txt
$andra
sandra
slee
$ grep -F '$andra' txt.txt
$andra
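The effect of -F is easiest to see with a character that is special wherever it appears in a regular expression, such as the dot. Here is a small sketch on made-up strings (the data is hypothetical):

```shell
data='1.5
125
1x5'

# As a regular expression, "." matches any single character: all three lines match.
regex_hits=$(printf '%s\n' "$data" | grep -c '1.5')

# As a fixed string, only the line containing a literal "1.5" matches.
fixed_hits=$(printf '%s\n' "$data" | grep -F -c '1.5')

echo "regex: $regex_hits, fixed: $fixed_hits"
```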

patterns from a file

Here’s an option that provides some interesting benefits. You can also put a series of literal strings in a file and look for them all using a command as in the example below.

Say we have a list of colors in one file:

$ cat colorlist
green
orange
purple

We then want to select from another file all of the lines that contain these color names:

$ fgrep -f colorlist colors
32 = green
33 = orange
35 = purple
42 = green background
43 = orange background
45 = purple background
92 = light green
95 = light purple
102 = light green background
105 = light purple background

The -f argument tells grep to get its patterns from the specified file rather than from the command line.

Note that we could have run this particular command without the -F option. The choice depends on what you’re looking for, though the use of -F offers a slight performance increase and no disadvantages when you’re looking for straight text.
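One thing to watch for with pattern files: an empty line is an empty pattern, and an empty pattern matches every line. The sketch below assumes mktemp is available; the file names and sample data are illustrative.

```shell
tmp=$(mktemp -d)
printf '32 = green\n33 = orange\n34 = blue\n' > "$tmp/colors"

# A clean pattern file: one literal string per line.
printf 'green\norange\n' > "$tmp/colorlist"
clean=$(grep -F -c -f "$tmp/colorlist" "$tmp/colors")

# The same list with a stray blank line now matches every line.
printf 'green\n\norange\n' > "$tmp/colorlist_blank"
blank=$(grep -F -c -f "$tmp/colorlist_blank" "$tmp/colors")

echo "clean list: $clean lines, list with blank line: $blank lines"
rm -rf "$tmp"
```

If a pattern file unexpectedly selects everything, checking it for blank lines is a reasonable first step.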

plgrep

Another option that you’ll find with the newer grep implementations is -P, which interprets the pattern provided as a Perl regular expression. I’m calling it plgrep (to avoid confusing it with pgrep). In the example below, you can see that we’re matching only color descriptions that have three parts.

$ plgrep '= \S+\s\S+\s' colors
100 = dark grey background
101 = light red background
102 = light green background
104 = light blue background
105 = light purple background
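Where grep was built without -P support, the same search can be approximated with -E and POSIX character classes. This sketch assumes the intended Perl pattern uses \S (non-whitespace) and \s (whitespace); the sample lines are illustrative.

```shell
data='35 = purple
45 = purple background
105 = light purple background'

# [^[:space:]]+ plays the role of \S+, and [[:space:]] the role of \s.
three_part=$(printf '%s\n' "$data" |
  grep -E '= [^[:space:]]+[[:space:]][^[:space:]]+[[:space:]]')

echo "$three_part"
```

Only the three-word description is printed, because the pattern requires whitespace after the second word.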

grep with context

Another very handy grep command is one that uses the -A (after) and -B (before) switches to provide some context for your located strings. An alias that shows your discovered line along with the lines that appear before and after it in the file might look like this. I’m calling it “cxgrep” for context grep and to keep it from being confused with grep’s -c switch.

alias cxgrep='grep -B 1 -A 1'

Here’s an example with the colors file:

$ cxgrep purple colors
34 = blue
35 = purple
36 = cyan
--
44 = blue background
45 = purple background
46 = cyan background
--
94 = light blue
95 = light purple
96 = turquoise
--
104 = light blue background
105 = light purple background
106 = turquoise background
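When the same number of lines is wanted on both sides, -C is a shorthand for setting -A and -B together. Here is a quick sketch on inline sample data, assuming a grep that supports the context options (GNU and BSD grep do):

```shell
data='33 = orange
34 = blue
35 = purple
36 = cyan
37 = grey'

both=$(printf '%s\n' "$data" | grep -B 1 -A 1 purple)
ctx=$(printf '%s\n' "$data" | grep -C 1 purple)

# Both forms print the match plus one line on either side.
echo "$ctx"
```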

grep NOT

The option to see only those lines that don’t contain a particular string can also be set up easily as an alias. Though the switch for this is -v, I would be tempted to call it “xgrep” to emphasize that it excludes the specified text.

alias xgrep='grep -v'

Here’s what the output looks like when we omit lines containing “background”:

$ xgrep background colors
0  = default colour
1  = bold
4  = underlined
5  = flashing text
7  = reverse field
31 = red
32 = green
33 = orange
34 = blue
35 = purple
36 = cyan
37 = grey
90 = dark grey
91 = light red
92 = light green
93 = yellow
94 = light blue
95 = light purple
96 = turquoise
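Several -e patterns can be combined with -v to drop lines that match any of them. The sketch below uses a hypothetical function form of the alias so that it also works inside a script; the sample lines are illustrative.

```shell
xgrep() { grep -v "$@"; }   # hypothetical function form of the alias

data='35 = purple
45 = purple background
95 = light purple'

# Keep only lines that contain neither "background" nor "light".
plain=$(printf '%s\n' "$data" | xgrep -e background -e light)
echo "$plain"
```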

cgrep

You can also use grep to count the number of lines your pattern appears on. Note that it doesn’t count more than one appearance per line.

$ grep -c purple colors
4
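To count every occurrence rather than every matching line, one common approach is to combine -o with wc -l. This is a sketch on made-up lines, assuming a grep that supports -o:

```shell
data='purple and more purple
no match here
light purple'

# -c counts matching lines.
lines=$(printf '%s\n' "$data" | grep -c purple)

# -o prints each match on its own line, so wc -l counts occurrences.
# Arithmetic expansion strips any padding wc adds.
occurrences=$(( $(printf '%s\n' "$data" | grep -o purple | wc -l) ))

echo "lines: $lines, occurrences: $occurrences"
```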

show last word in each line only

grep -P '\S+$' -o colors

$ grep -P '\S+$' -o colors | sort | uniq
background
blue
bold
colour
cyan
field
green
grey
orange
purple
red
text
turquoise
underlined
yellow
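The same last-word extraction can be sketched without -P by using -E and a POSIX character class. This assumes the intended Perl pattern is \S+$ and that grep supports -o; the sample lines are illustrative.

```shell
data='35 = purple
45 = purple background
105 = light purple background'

# [^[:space:]]+$ stands in for \S+$; sort -u merges the sort | uniq step.
last_words=$(printf '%s\n' "$data" | grep -oE '[^[:space:]]+$' | sort -u)
echo "$last_words"
```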

position in the file

$ grep -o -b purple colors
131:purple
271:purple
417:purple
586:purple

The combination of the -o (i.e., only) and the -b (byte position) options results in our seeing only the byte offsets of the word “purple”, but not the descriptions it appears in. If we were looking for the offsets at which the entire lines begin, we’d omit the -o and see something like this:

126:35 = purple
266:45 = purple background
406:95 = light purple
574:105 = light purple background
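The offsets that -b reports are zero-based, which can be checked by reading the bytes back from the file. This sketch assumes GNU grep behavior for -o -b and the availability of mktemp and head -c; the file content is illustrative.

```shell
tmp=$(mktemp -d)
printf '34 = blue\n35 = purple\n' > "$tmp/colors"

# -b reports a zero-based byte offset; with -o it is the offset of the match.
offset=$(grep -o -b purple "$tmp/colors" | cut -d: -f1)

# tail -c +N is one-based, so add 1, then read six bytes from there.
found=$(tail -c +"$((offset + 1))" "$tmp/colors" | head -c 6)

echo "offset $offset holds: $found"
rm -rf "$tmp"
```

The first line occupies 10 bytes including its newline, and “purple” starts 5 bytes into the second line, so the reported offset is 15.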

While the old grep “family” offered very useful grep capabilities, a larger group of grep aliases can help you make use of many of the command’s powerful options for getting work done quickly and efficiently.

Sandra Henry-Stocker has been administering Unix systems for more than 30 years. She describes herself as "USL" (Unix as a second language) but remembers enough English to write books and buy groceries. She lives in the mountains in Virginia where, when not working with or writing about Unix, she's chasing the bears away from her bird feeders.

The opinions expressed in this blog are those of Sandra Henry-Stocker and do not necessarily represent those of IDG Communications, Inc., its parent, subsidiary or affiliated companies.