A quick tour of bash
Bash is the default command-line interpreter in most Linux distributions. The principal task of bash is to receive a series of characters that are parsed and interpreted as commands.
To master bash, it is essential to know the underlying concepts and how they can be combined to get faster results through shortcuts. In this post, we take a quick tour of the main components of bash and how they can be used on a daily basis.
Fast navigation
Undoubtedly, one of the most essential tasks in bash is navigating the filesystem hierarchy. Although navigation seems simple, the underlying concepts are rarely considered.
There are two main concepts behind this process:
- Current working directory (CWD). Defines the working directory of the process. When you use the `cd` command, the bash process changes this internal variable in the background.
Note. The current working directory allows programs to use relative paths, and it is not a concept exclusive to bash; in fact, every process in the system has a current working directory variable.
- Root directory. Defines the root directory of the process. It is natural to think that `/` is always the root directory; in reality, every process has an independent variable that defines its root directory. It is possible to change the root directory of a process with the `chroot` command.
When you need to return to the last directory visited, you can use:
```bash
cd -
```
Need to go to the home directory of the current user? Type:

```bash
cd
```
Are you an administrator who needs to navigate to steve's home directory? Type:

```bash
cd ~steve
```
There are two special paths when you navigate. The `.` path refers to the current working directory, and `..` to the parent directory. It is important to know that it is not possible to reference a directory above the root directory: the parent directory of the root directory is the root directory itself.
Program execution
Running programs undoubtedly takes up much of the time we spend at the terminal and it is essential to have a minimal idea of what happens behind the scenes.
A program is the static representation of a process: a set of instructions, usually stored in a file, that is ready to be executed. A process, on the other hand, is an instance of a program that is currently being "executed".
In reality, operating systems use scheduling algorithms that give small slices of time to each process so quickly that everything seems to run at the same time, when in fact only a few processes are executing at any given instant, depending on hardware aspects such as the number of cores.
Bash itself is a program that becomes a process when you open a terminal and create an instance of it. In order to get the most out of the terminal, it is necessary to be able to repeat old commands quickly.
Note. If you are interested in how bash finds and executes the commands that you type, see this post.
A frequent need is to repeat the last command executed. To achieve this, enter:

```bash
!!
```
Do you often forget to add `sudo` and receive a permission error? To fix it quickly, we can repeat the previous command with `sudo` as a prefix:

```bash
sudo !!
```
You may want to repeat the last command that starts with a given word; for example, to run the most recent command that started with the word `ssh`:

```bash
!ssh
```
It is very common to need the last argument of the previous command, for example, to create a directory and navigate to it. You can get the last argument of the previous command through `$_`, for example:

```bash
mkdir -p dir1/dir2/dir3
cd $_
```
Note. The `-p` option in the `mkdir` command creates all parent directories in case they do not exist.
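Closely related to `$_`, bash's history expansion also offers `!$`, which expands to the last word of the previous command line; a small sketch (the path is just an example):

```bash
ls -l /var/log/nginx   # inspect a directory first
cd !$                  # expands to: cd /var/log/nginx
```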
Connecting processes
One of the most powerful ideas in bash is the ability to combine commands to achieve a result. The concept of combining several processes is generally known as a pipeline, due to its similarity with a physical pipe: information enters through one process, the result produced by that process is sent to the next process, and the cycle repeats, each process passing its result to the next.
Mastering the combination and communication between processes is something that takes a lot of time and knowledge of the commands and their options. In this post, we will only show a small sample of the potential of communication between processes and files.
File descriptors
Before delving into the communication mechanism, note that each process within the operating system has a list of open files. This list has a limited length depending on the system configuration, and conceptually each entry is identified by an integer that defines the "descriptor" or file identifier, which is why they are generally called file descriptors.
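You can inspect this limit on your own system; for example, `ulimit -n` prints the maximum number of file descriptors a process started from the current shell may keep open:

```bash
ulimit -n   # maximum number of open file descriptors per process (commonly 1024)
```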
At startup, every process contains three file descriptors:
- Standard input (stdin). Represented by file descriptor 0, it is the device or file from which the process input originates. It is usually the keyboard.
- Standard output (stdout). Represented by file descriptor 1, it is the device or file where the normal output of the process is sent. It is usually the screen.
- Standard error (stderr). Represented by file descriptor 2, it is the device or file where the error output of the process is sent. It is usually the screen. Although stderr is formally defined to receive error messages, this is not mandatory. It is the programmer's decision to send error messages to stdout or stderr, although the convention of sending errors to stderr is generally respected.
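On Linux you can see these three descriptors for yourself by listing `/proc/self/fd`; the output below is illustrative and will vary between systems:

```bash
ls -l /proc/self/fd
# 0 -> /dev/pts/0   (stdin:  the terminal)
# 1 -> /dev/pts/0   (stdout: the terminal)
# 2 -> /dev/pts/0   (stderr: the terminal)
```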
Pipes
To begin with the communication mechanisms, let's start with the concept of pipes. Also known as unnamed pipes, they are written in bash with the `|` symbol, and informally they can be said to pass the stdout of one command to the stdin of the next.
Formally, a pipe connects the stdout descriptor of the first command to the stdin descriptor of the second command, so that all the output generated by the first command becomes the input of the second.
In the following example, we create a pipeline with 4 commands:
```bash
cat file_with_logs.txt | cut -d" " -f1 | sort | head -n 5
```
- `cat file_with_logs.txt`: Displays the contents of the file.
- `cut -d" " -f1`: Receives the output of the `cat` command, splits each row into columns using a space as the delimiter, and prints the first column of every row.
- `sort`: Receives the output of the `cut` command and sorts the rows in alphabetical order.
- `head -n 5`: Receives the output of the `sort` command and displays the first 5 rows.
File redirection
File redirection is similar to the concept of pipes, with the difference that in pipes both ends of the communication are processes, while in file redirection one end is a file and the other is a process.
```bash
program1 | program2   # Pipe (process to process)
program1 > file1      # File redirection (process to file)
program1 < file2      # File redirection (file to process)
```
Basically, there are two types of redirection.
- To a file. Expressed by the `>` symbol, it redirects the output of a process to a file.
- From a file. Expressed by the `<` symbol, it directs the contents of a file as input to a process.
In the following example, the `echo` command writes the string "hello world" to stdout, which is then redirected by the `>` symbol to the `file.txt` file.
echo "hello world" > file.txt
Note. This example will overwrite the content of the `file.txt` file with the output of the `echo` command. If you want to append the output of the `echo` command to the existing content of the `file.txt` file, you can use `>>` instead of `>`.
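A minimal sketch of the difference between `>` and `>>`:

```bash
echo "first line"  > file.txt   # creates or overwrites file.txt
echo "second line" >> file.txt  # appends to file.txt
cat file.txt                    # prints both lines
```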
Conversely, we can make the origin of the communication a file. In this example, we redirect the content of the file `logs.txt` to the `grep` command, which filters its input looking for the lines that contain `error`.
```bash
grep error < logs.txt
```
The previous effect could have been achieved simply with `grep error logs.txt`, or even with `cat logs.txt | grep error`. The reality is that there are different ways to express the same idea, depending on the options that each command supports.
Cleaning the output
Something that is really useful is being able to redirect stderr to another file so that it is not displayed on the screen. A common way to do this is through the `/dev/null` file.

`/dev/null` is a pseudo-device file provided by the operating system that simply discards all the content written to it.
To redirect stderr, just prefix the `>` symbol with the file descriptor corresponding to stderr (2):
```bash
ls -R 2> /dev/null
```
Note. The `-R` option in the `ls` command lists subdirectories recursively.
The above command displays the output of the `ls -R` command and hides all the error output, for example, when it tries to list the contents of a directory for which you don't have the required permissions.
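The same mechanism lets you split the two streams into separate files instead of discarding one; the file names here are just examples:

```bash
ls -R /etc > listing.txt 2> errors.txt   # stdout to listing.txt, stderr to errors.txt
```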
Note. The `/dev/null` file belongs to a family of files provided by the operating system known as "virtual device files". These provide functionality such as the generation of pseudo-random values (`/dev/urandom`) and the generation of an infinite sequence of zeros (`/dev/zero`), among others.
Redirection can also occur locally between the descriptors of the same process. If we want stderr to be directed to the same place as stdout, it is enough to write `2>&1`.
```bash
ls -R 2>&1 | grep hello
```
In this example, both stderr and stdout are sent through the pipe to `grep`. Without `2>&1`, only stdout would cross the pipe, and the errors would still appear on the screen.
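The same construction is used to capture everything a command prints into a single file. Note that the order matters, because redirections are processed from left to right:

```bash
ls -R /etc > all_output.txt 2>&1   # stdout goes to the file, then stderr joins it
ls -R /etc 2>&1 > all_output.txt   # wrong order: stderr still goes to the screen
```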
Process substitution
Another tool provided by bash is process substitution. It is very useful when you need the output of a command in the form of a file, which is necessary for commands that only accept input through a file.
For example, if we want to know the hash of the word "hello", we could use the `sha256sum` command. This command requires an argument with the name of a file that contains the data, for example:
```bash
echo "hello" > file.txt
sha256sum file.txt
```
In this example, you create `file.txt` containing the word hello. This file is passed as an argument to the `sha256sum` command, which returns the hash of the content.
But there is an intermediate step that requires us to put the contents in a file; this can be avoided with process substitution.
```bash
sha256sum <(echo "hello")
```
Process substitution basically converts the output of the `echo "hello"` command into a file and passes it as an argument to the `sha256sum` command. To better understand what is going on, it is possible to print what is returned by `<(echo "hello")` through `echo <(echo "hello")`. You can verify that it indeed returns a path similar to `/dev/fd/63`.
By putting all the pieces together, our original command is actually:
```bash
sha256sum /dev/fd/63
```
What actually happens is that bash redirects the output of `echo "hello"` to a temporary file managed by the system, in this case the `/dev/fd/63` file, which is finally passed as an argument to the `sha256sum` command.
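A classic use case is comparing the output of two commands with `diff`, which expects files as arguments; the directory names here are hypothetical:

```bash
diff <(ls dir_a) <(ls dir_b)   # compare the listings of two directories
```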
Command substitution
Finally, the last tool we are going to show is command substitution. It consists of executing one or several commands and using their output as part of a larger command.
Command substitution is different from a pipe: it does not redirect the output to the input of another command; instead, it expands/replaces the subcommand with its output.
Command substitution in bash is done with either the `$()` operator or backticks (`` ` ``). The recommendation is to always use `$()` instead of backticks, because the latter can be confused with single quotes.
In the following example, the `echo` command prints a message in which `$(hostname)` is substituted with the output of the `hostname` command.
Input:

```bash
echo "Hello world from computer: $(hostname)"
```

Output:

```
Hello world from computer: your_hostname
```
Note. The `hostname` command simply returns the name assigned to the current host or machine.
Command substitution can also be mixed with pipelines.
```bash
echo "Message: $(ls | tr ' ' '\n' | head -n 1)"
```
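Command substitution is also the standard way to capture a command's output in a variable; a small sketch:

```bash
file_count=$(ls | wc -l)   # number of entries in the current directory
echo "This directory contains $file_count entries"
```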
Conclusions
Mastering bash is part of a learning process that only comes with practice. In order to get the most out of what has been learned, it’s necessary to start combining the knowledge acquired with the new commands that you will find on your way. Some of the commands that I would recommend you learn are:
- `awk`
- `sed`
- `cut`
- `tr`
- `grep`
- etc.
Thanks for reading!
That's all for this post. I hope the content has been to your liking; see you in the next post.