Sandra Henry-Stocker
Unix Dweeb

Peering into binary files on Linux

How-To
Jun 28, 2021
6 mins
Linux

Here are eight Linux commands for looking into binary files and viewing details about what executables are doing when they run.


Any file on a Linux system that isn’t a text file is considered a binary file, from system commands and libraries to image files and compiled programs. But being binary doesn’t mean that you can’t look into these files. In fact, there are a number of commands that you can use to extract data from binary files or display their content. In this post, we’ll explore several of them.

file

One of the easiest ways to pull information from a binary file is the file command, which identifies files by type. It does this in several ways: by evaluating the content, looking for a “magic number” (a file type identifier near the start of the file), and checking the language. While we humans generally judge a file by its extension, the file command largely ignores it. Notice how it responds to the command shown below.

$ file camper.png
camper.png: JPEG image data, JFIF standard 1.01, resolution (DPI),
density 72x72, segment length 16, Exif Standard: [TIFF image data,
little-endian, direntries=11, manufacturer=samsung, model=SM-G935V,
orientation=upper-left, xresolution=164, yresolution=172,
resolutionunit=2, software=GIMP 2.8.18, datetime=2018:04:30 07:56:54,
GPS-Data], progressive, precision 8, 3465x2717, components 3

The file command easily determined that “camper.png” is actually a JPEG file, but in this case, it tells us a lot more. This includes the image dimensions (3465×2717 pixels), a timestamp (2018:04:30 07:56:54), the software that last wrote the file (GIMP 2.8.18), and the make and model of the cell phone used to take the photo. Not all JPEG files will contain all of this data, but file will show you what is available.
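To make the “magic number” idea concrete, here is a minimal sketch that guesses a file's type from its first few bytes and never looks at the file name. The function name magic_type is hypothetical, it knows only three signatures, and it is in no way how the file command itself is implemented.

```shell
# Hypothetical helper (not part of the file command): guess a file's type
# from its leading "magic" bytes, ignoring the file name entirely.
magic_type() {
    # Read the first 4 bytes as lowercase hex with no separators.
    sig=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    case "$sig" in
        ffd8ff*)  echo "JPEG image" ;;   # JPEG files start with ff d8 ff
        89504e47) echo "PNG image" ;;    # 0x89 followed by "PNG"
        7f454c46) echo "ELF binary" ;;   # 0x7f followed by "ELF"
        *)        echo "unknown" ;;
    esac
}
```

Run as magic_type camper.png, a sketch like this should report a JPEG image for a file such as the one above, regardless of its .png extension.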

Ask about a system binary and the output will look very different.

$ file /bin/date
/bin/date: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV),
dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2,
BuildID[sha1]=9ce744916618c6eef6f28ff70a3758675c307fb2, for GNU/Linux
3.2.0, stripped

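In this case, we see that the date command is, not surprisingly, an ELF (Executable and Linkable Format) file. The output also tells us that it is a 64-bit, dynamically linked, position-independent executable built for x86-64 and that it has been stripped of its symbol table.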

ldd

The ldd command lists the shared libraries that are used by an executable. The date command uses only a few.

$ ldd /bin/date 
linux-vdso.so.1 (0x00007fff21162000) 
libc.so.6 => /lib64/libc.so.6 (0x00007f2572f45000) 
/lib64/ld-linux-x86-64.so.2 (0x00007f2573141000)
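If you only want the resolved library paths, say to feed them to another command, the ldd output can be filtered. The sketch below assumes the output format shown above; ldd_paths is a hypothetical helper name, not an ldd option.

```shell
# Hypothetical filter (not an ldd option): read ldd-style output on stdin and
# print only the resolved library paths, one per line.
ldd_paths() {
    # Lines of interest look like: "libc.so.6 => /lib64/libc.so.6 (0x...)".
    # Entries without "=>" (the vDSO, the loader) and "not found" are skipped.
    awk '$2 == "=>" && $3 ~ /^\// { print $3 }'
}
```

It would be used as ldd /bin/date | ldd_paths, which for the output above would leave just /lib64/libc.so.6.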

ltrace

The ltrace command traces the library calls that an executable makes as it runs.

$ ltrace pwd 
getenv("POSIXLY_CORRECT") = nil 
strrchr("pwd", '/') = nil 
setlocale(LC_ALL, "") = "en_US.UTF-8" 
bindtextdomain("coreutils", "/usr/share/locale") = "/usr/share/locale" 
textdomain("coreutils") = "coreutils" 
__cxa_atexit(0x5644cb982120, 0, 0x5644cb985b20, 0x6c69747565726f63) = 0 
getopt_long(1, 0x7fff17badb18, "LP", 0x5644cb985b40, nil) = -1 
getcwd(nil, 0) = "" 
puts("/home/shs"/home/shs 
) = 10 
free(0x5644cbbdf440) = <void> 
__fpending(0x7f18d802a520, 0, 0x5644cb982120, 1) = 0 
fileno(0x7f18d802a520) = 1 
__freading(0x7f18d802a520, 0, 0x5644cb982120, 1) = 0

strace

The strace command traces system calls and is a very useful diagnostic, debugging and instructional utility. One unusual thing about it is that it writes its trace to stderr (standard error), while the output of the command being traced still goes to stdout (standard out). So, if you want to save the tracing information in a file, redirect stderr with commands like these:

$ strace ls camp* 2>output.txt
camper_10.jpg  camper.jpg  camper.png
$
$ head -8 output.txt
execve("/usr/bin/ls", ["ls", "camper_10.jpg", "camper.jpg", "camper.png"], 0x7ffd7ec34f18 /* 34 vars */) = 0
brk(NULL)                               = 0x5646e6bae000
arch_prctl(0x3001 /* ARCH_??? */, 0x7ffc3d514cf0) = -1 EINVAL (Invalid argument)
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=61880, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 61880, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fb2b67a9000
close(3)                                = 0
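The stdout/stderr split is easy to see without strace at all. In the sketch below, fake_trace is a made-up stand-in that writes one line to each stream, and the temporary directory is only there to keep the example self-contained.

```shell
# Stand-in for a traced command: like strace, it writes diagnostics to
# stderr while the normal output goes to stdout.
fake_trace() {
    echo "camper.jpg"                            # normal output -> stdout
    echo 'execve("/usr/bin/ls", ...) = 0' >&2    # trace line    -> stderr
}

# 2> captures only stderr; stdout is still free to go to the terminal,
# a pipe, or (as here) a command substitution.
workdir=$(mktemp -d)
listing=$(fake_trace 2>"$workdir/output.txt")
```

After this runs, listing holds the normal output and output.txt holds only the trace line. With the real strace, the -o option (strace -o output.txt ls camp*) is another way to write the trace to a file without touching stderr.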

hexdump

The hexdump command displays the content of binary files in hexadecimal. With the addition of the -C option, it also provides a character translation, so that we can easily pick out the “magic numbers” that identify the file types – JFIF and ELF in the samples below.

$ hexdump -C camper.jpg | head -5 
00000000 ff d8 ff e0 00 10 4a 46 49 46 00 01 01 01 00 48 |......JFIF.....H| 
00000010 00 48 00 00 ff e1 38 7e 45 78 69 66 00 00 49 49 |.H....8~Exif..II| 
00000020 2a 00 08 00 00 00 0b 00 0f 01 02 00 08 00 00 00 |*...............| 
00000030 92 00 00 00 10 01 02 00 09 00 00 00 9a 00 00 00 |................| 
00000040 12 01 03 00 01 00 00 00 01 00 00 00 1a 01 05 00 |................| 
$ hexdump -C /bin/date | head -5 
00000000 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 |.ELF............| 
00000010 03 00 3e 00 01 00 00 00 40 42 00 00 00 00 00 00 |..>.....@B......| 
00000020 40 00 00 00 00 00 00 00 70 9a 01 00 00 00 00 00 |@.......p.......| 
00000030 00 00 00 00 40 00 38 00 0d 00 40 00 1f 00 1e 00 |....@.8...@.....| 
00000040 06 00 00 00 04 00 00 00 40 00 00 00 00 00 00 00 |........@.......| 

This is not unlike the output you would get using the od (octal dump) command, but the display is a little easier to read.

$ od -hc camper.jpg | head -6
0000000    d8ff    e0ff    1000    464a    4649    0100    0101    4800
        377 330 377 340  
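Note that od -h groups bytes into 16-bit words, so on a little-endian machine such as x86-64 each pair appears swapped: the first bytes of the file, ff d8, show up as d8ff. If you want od to show bytes in file order the way hexdump -C does, ask for one-byte units instead. Here is a small sketch; bytes_in_order is a hypothetical helper name.

```shell
# Hypothetical helper: show a file's first N bytes (default 8) one byte at a
# time, in file order, using od rather than hexdump.
bytes_in_order() {
    # -An drops the offset column, -tx1 prints one hex byte per unit,
    # -N limits how many bytes are read.
    od -An -tx1 -N"${2:-8}" "$1" | tr -s ' \n' ' ' | sed 's/^ //; s/ $//'
}
```

For the image used in this post, bytes_in_order camper.jpg 4 should print the JPEG signature as ff d8 ff e0.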

Sandra Henry-Stocker has been administering Unix systems for more than 30 years. She describes herself as "USL" (Unix as a second language) but remembers enough English to write books and buy groceries. She lives in the mountains in Virginia where, when not working with or writing about Unix, she's chasing the bears away from her bird feeders.

The opinions expressed in this blog are those of Sandra Henry-Stocker and do not necessarily represent those of IDG Communications, Inc., its parent, subsidiary or affiliated companies.