A quick tour for bash

Sergio Daniel Cortez Chavez
10 min read · Dec 21, 2021


Bash is the default command-line interpreter in most Linux distributions. The principal task of bash is to read a series of characters, parse them, and interpret them as commands.

To master bash, it is essential to know the underlying concepts and how they can be connected to get faster results through shortcuts. In this post, we are going to take a quick tour of the main components of bash and how they can be used on a daily basis.

Fast navigation


Undoubtedly, one of the most essential tasks in bash is navigating the filesystem hierarchy. Although navigation seems simple, the underlying concepts are rarely considered.

Two main concepts operate in the background of this process:

  • Current working directory (CWD). Defines the working directory of the process. When you use the cd command, the bash process changes this internal variable in the background.

Note. The current working directory allows programs to use relative paths, and it is not a concept attached to bash; in fact, every process in the system has a current working directory variable.

  • Root directory. Defines the root directory for the process. It's natural to think that / is always the root directory; in reality, every process has an independent variable that defines its root directory. It is possible to change the root directory of a process with the chroot command.
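Since the working directory is a per-process attribute, it can be observed outside of bash as well. The sketch below is Linux-specific: /proc/self points at the process that reads it, and its cwd entry is a symbolic link to that process's current working directory.

```shell
# Move the shell to a known directory; the readlink child process
# inherits this CWD, so /proc/self/cwd resolves to /tmp.
cd /tmp
readlink /proc/self/cwd   # prints: /tmp
```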

When you need to return to the last directory visited, you can use:

cd -

Need to go to the home directory of the current user? Type:

cd

Are you an administrator who needs to navigate to steve's home directory? Type:

cd ~steve

There are two special paths when you navigate. The . path refers to the current working directory, and .. to the parent directory. It is important to know that it is not possible to reference the parent directory of the root directory: the parent of the root directory is the root directory itself.
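This behavior is easy to verify: any chain of .. components applied to / still resolves to /.

```shell
cd /..         # the parent of / is / itself
pwd            # prints: /
cd /../../..   # still the root
pwd            # prints: /
```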

Program execution


Running programs undoubtedly takes up much of the time we spend at the terminal, and it is essential to have at least a minimal idea of what happens behind the scenes.

Programs are the static representation of a process. Programs are a set of instructions usually stored in a file that is ready to be executed, while processes are instances of programs that are currently being “executed”.

The reality is that operating systems use complex scheduling algorithms that give small slices of time to each process. This happens so fast that everything seems to run at the same time, when in fact only a few processes are running at any given instant, depending on hardware aspects such as the number of cores.

Bash itself is a program that becomes a process when you open a terminal and create an instance of it. To get the most out of the terminal, it is necessary to be able to repeat old commands.

Note. If you are interested in how bash finds and executes the commands that you type, you can see this post.

A frequent need is to repeat the last executed command. To achieve this, enter:

!!

Did you forget to add sudo and receive a permissions error? To solve it quickly, we can repeat the previous command with sudo as a prefix:

sudo !!

You may want to repeat the last executed command that starts with a given word, for example, the last executed command that started with the word ssh:

!ssh

It is very common to require the last argument of the previous command, for example, to create a directory and navigate to it. You can get the last argument of the previous command through $_ , for example:

mkdir -p dir1/dir2/dir3
cd $_

Note. The -p option in the mkdir command creates all parent directories in case they do not exist.

Connecting processes


One of the most powerful ideas in bash is the ability to combine commands to achieve a result. Combining several processes is generally known as a pipeline, due to its similarity to a physical pipe: information enters the first process, every result produced by that process is sent to the next process, and the cycle repeats until the last process.

Mastering the combination and communication between processes takes a lot of time and knowledge of the commands and their options. In this post, we will only show a small sample of the potential of communication between processes and files.

File descriptors

Before delving into the communication mechanisms, some background: each process within the operating system has a list of open files. This list has a limited length depending on the system configuration and is conceptually represented by a list of integers, each identifying an open file, which is why they are generally called file descriptors.

When it starts, any process contains three file descriptors:

  1. Standard input (stdin). Represented by file descriptor 0, it is the device or file from which the process input originates. It is usually the keyboard.
  2. Standard output (stdout). Represented by file descriptor 1, it is the device or file where the normal output of the process is sent. It is usually the screen.
  3. Standard error (stderr). Represented by file descriptor 2, it is the device or file where the error output of the process is sent. It is usually the screen. Although stderr is formally defined to receive error messages, this is not a must: it is the programmer's decision to send error messages to stdout or stderr, although the convention of sending errors to stderr is generally respected.
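A small sketch makes the separation between the streams visible. Here one line is written to stdout and another to stderr, and each stream is captured in its own file (out.txt and err.txt are just illustrative names):

```shell
# fd 1 (stdout) is redirected to out.txt, fd 2 (stderr) to err.txt.
{ echo "normal message"; echo "error message" >&2; } >out.txt 2>err.txt

cat out.txt   # prints: normal message
cat err.txt   # prints: error message
```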

Pipes

To start with the communication mechanisms, we are going to begin with the concept of pipes. Also known as unnamed pipes, they are defined in bash by the | symbol and, informally, pass the stdout of one command to the stdin of the next.

Formally, the pipe connects the stdout descriptor of the first command to the stdin descriptor of the second, so that all the output generated by the first command becomes the input of the second.

In the following example, we create a pipeline with 4 commands:

cat file_with_logs.txt | cut -d" " -f1 | sort | head -n 5

  1. cat file_with_logs.txt: Displays the contents of the file.
  2. cut -d" " -f1: Receives the output of the cat command, cuts the content passed as input into columns, and prints the first column of all rows.
  3. sort: Receives the output of the cut command and sorts the rows passed as input in alphabetical order.
  4. head -n 5: Receives the output of the sort command and displays the first 5 rows passed as input.
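To see the pipeline in action, we can create a small sample file (its contents here are invented for illustration) and run the same chain of commands:

```shell
# Sample data: the first column of each line is a user name.
printf 'carol login ok\nalice login fail\nbob logout ok\n' > file_with_logs.txt

cat file_with_logs.txt | cut -d" " -f1 | sort | head -n 5
# prints:
# alice
# bob
# carol
```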

File redirection


File redirection is similar to the concept of pipes, with the difference that in pipes both ends of the communication are processes, while in file redirection one end is a file and the other is a process.

program1 | program2 # Pipe (Process to Process)
program1 > file1 # File redirection (Process to file)
program1 < file2 # File redirection (File to Process)

Basically, there are two types of redirection.

  1. To a file. Expressed by the > symbol, it redirects the output of a process to a file.
  2. From a file. Expressed by the < symbol, it directs the contents of a file as input to a process.

In the following example, the echo command will send the string "hello world" to stdout, which will then be redirected by the > symbol to the file.txt file.

echo "hello world" > file.txt

Note. This example will overwrite the content of the file.txt file with the output of the echo command. If you want to append the output of the echo command to the existing content of the file.txt file, you can use >> instead of >.
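The difference between > and >> can be seen in a couple of lines:

```shell
echo "first"  > file.txt   # > truncates file.txt and writes one line
echo "second" >> file.txt  # >> appends a second line
cat file.txt
# prints:
# first
# second
```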

Conversely, we can make the origin of the communication a file. In this example, we will redirect the content of the file logs.txt to the grep command, which will filter its content looking for the lines that contain error.

grep error < logs.txt

The previous effect could also have been achieved with grep error logs.txt , or even with cat logs.txt | grep error . The reality is that there are different ways to express the same idea, depending on the options that each command supports.

Cleaning the output


Something really useful is being able to redirect stderr to another file so that it is not displayed on the screen. A common way to do this is through the /dev/null file.

/dev/null is a pseudo-device file provided by the operating system that simply discards all the content that you write to it.

In order to redirect stderr, just add 2 as a prefix to > ; this is the file descriptor corresponding to stderr.

ls -R 2> /dev/null

Note. The -R option of the ls command lists subdirectories recursively.

The above command displays the output of the ls -R command and hides all the output related to errors, for example, when you attempt to list the contents of a directory for which you don't have the required permissions.

Note. The /dev/null file belongs to a family of files provided by the operating system known as "virtual device files". These provide functionality such as the generation of pseudo-random values (/dev/urandom), an infinite sequence of zeros (/dev/zero), and others.
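A quick way to get a feel for these virtual devices (on Linux) is to read a few bytes from them:

```shell
# /dev/zero is an endless stream of zero bytes (shown here as hex).
head -c 4 /dev/zero | od -An -tx1      # prints: 00 00 00 00 (spacing may vary)

# /dev/urandom produces pseudo-random bytes; the output changes every run.
head -c 4 /dev/urandom | od -An -tx1
```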

Redirection can also occur locally between the descriptors of the same process. If we want stderr to be directed to the same place as stdout, it is enough to write 2>&1 .

ls -R 2>&1 | grep hello

In this example, both stderr and stdout will be sent through the pipe to grep. Without the 2>&1 , only stdout would travel through the pipe and stderr would still be printed on the screen.
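A minimal sketch of the difference: a pipe only carries stdout, so stderr has to be merged into it with 2>&1 before it can reach the next command.

```shell
# Without 2>&1 the error line bypasses the pipe (discarded here).
{ echo "out"; echo "err" >&2; } 2>/dev/null | cat   # prints: out

# With 2>&1 both streams travel through the pipe to sort.
{ echo "out"; echo "err" >&2; } 2>&1 | sort         # prints: err, then out
```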

Process substitution

Another tool provided by bash is process substitution. It is very useful in cases where the output of a command is required in the form of a file, which is necessary when a command only receives its input data through a file.

For example, if we want to know the hash of the word “hello”, we could use the sha256sum command. This command requires an argument with the name of a file that contains the data, for example:

echo "hello" > file.txt
sha256sum file.txt

In this example, you create file.txt, which contains the word hello . This file is passed as an argument to the sha256sum command, which returns the hash of its content.

But there is an intermediate step that requires us to put the contents in a file; this can be shortened with process substitution.

sha256sum <(echo "hello")

Process substitution basically converts the output of the echo "hello" command to a file and passes it as an argument to the sha256sum command. To better understand what is going on, it is possible to print what is returned by <(echo "hello") through echo <(echo "hello") . You can verify that it indeed returns a path similar to /dev/fd/63 .

By putting all the pieces together, our original command is actually:

sha256sum /dev/fd/63

What actually happens is that bash redirects the output of echo "hello" to a temporary file managed by the system, in this case the /dev/fd/63 file, which is finally passed as an argument to the sha256sum command.
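A classic use of process substitution is diff, which expects two file names as arguments. Thanks to the /dev/fd mechanism described above, it can compare the output of two commands directly (the printf data below is made up for the example):

```shell
# Compare the output of two commands without creating temporary files.
# Both "files" contain the lines a, b, so diff prints nothing and exits 0.
diff <(printf 'b\na\n' | sort) <(printf 'a\nb\n')
```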

Command substitution

Finally, the last tool we are going to show is command substitution. It consists of executing one or several commands and taking their output as part of a larger command.

Command substitution is different from a pipe: it does not redirect the output to the input of another command; instead, it expands/replaces the subcommand with its output.

Command substitution in bash is done with the $() or `` operator. The recommendation is to always use $() instead of `` , because the latter can be confused with single quotes.

In the following example, the echo command prints a message in which $(hostname) is substituted with the output of the hostname command.

Input:

echo "Hello world from computer: $(hostname)"

Output:

Hello world from computer: your_hostname

Note. The hostname command simply returns the name assigned to the current host or machine.

Command substitution can also be mixed with pipelines.

echo "Message: $(ls | tr ' ' '\n' | head -n 1)"
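Command substitution is also the standard way to capture a command's output in a shell variable:

```shell
# Store the first entry of the root directory in a variable.
first=$(ls / | head -n 1)
echo "First entry in /: $first"
```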

Conclusions

Mastering bash is part of a learning process that only comes with practice. In order to get the most out of what has been learned, it’s necessary to start combining the knowledge acquired with the new commands that you will find on your way. Some of the commands that I would recommend you learn are:

  • awk
  • sed
  • cut
  • tr
  • grep
  • etc.

Thanks for reading!

That is all for this post. I hope you liked the content; see you in the next post.
