How a Shell Works

Santiago Peña Mosquera
5 min read · Apr 17, 2020

What is a shell?

First things first: let's take a look at what a shell is. In computing, the term shell refers to a program that provides a user interface to the services of the operating system. That interface can be graphical or plain text. Shells are designed to make it easier to invoke and run the programs available on the computer. One of the most common shells is Bash, which stands for "Bourne-again shell". The kind of shell discussed here is a command-line interface, where the user issues commands to the machine as lines of text.

What happens when you type ls -l in the shell

Now, we’re going to get under the hood and show you what’s really going on underneath the neat interface facade. The question at hand is, what happens when you type “ls -l” in your shell?

ls is a command used to list the contents of a directory. The -l flag displays the listing in long format, which adds information such as file permissions, owner, size, and the date and time of the last modification.

But what is really going on under the surface?

The shell is a program that runs an infinite loop. When we type ls, it is already running and waiting for the user to enter a command, and it keeps doing so until it reads end-of-file (EOF) or the exit command.

User Input

On each cycle of the loop, the shell first displays the prompt, usually a dollar sign $, after which the user can type a command like "ls -l".
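The loop described above can be sketched in C roughly as follows. This is a minimal illustration, not the code of any real shell: the function name prompt_loop and its in/out stream parameters are assumptions, chosen so the loop can be exercised without a terminal, and each command is only counted instead of being executed.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* exposes getline() on strict compilers */
#endif
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: prints a prompt, reads one line per cycle, and
 * stops on EOF or when the line is exactly "exit".
 * Returns how many non-empty commands were read. */
int prompt_loop(FILE *in, FILE *out)
{
    char *line = NULL;
    size_t size = 0;
    int commands = 0;

    for (;;) {
        fputs("$ ", out);                     /* display the prompt */
        fflush(out);
        if (getline(&line, &size, in) == -1)  /* EOF (Ctrl+D) or error */
            break;
        line[strcspn(line, "\n")] = '\0';     /* drop the trailing newline */
        if (strcmp(line, "exit") == 0)
            break;
        if (line[0] != '\0')
            commands++;                       /* a real shell would parse and run it here */
    }
    free(line);
    return commands;
}
```

In an interactive shell this would be called as prompt_loop(stdin, stdout).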

The shell reads this input with a function called getline(). A typical call looks like this:

getline(&buffer,&size,stdin);

&buffer is the address of a char * pointer that, after the call, points to the memory where the input string is stored. &size is the address of the variable that holds the size of that buffer. Finally, stdin is the input stream to read from. getline() reads one full line of text from standard input, including the trailing newline, allocating or growing the buffer as needed. It returns the number of characters read, or -1 on end-of-file or error.
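To make the roles of the three arguments concrete, here is a small sketch of reading one command line with getline(). The wrapper name read_command_line and its stream parameter are assumptions for illustration. A shell would pass stdin.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* exposes getline() on strict compilers */
#endif
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical wrapper: reads one line from `stream` and strips the
 * trailing newline. Returns a heap-allocated string that the caller
 * must free(), or NULL on end-of-file or error. */
char *read_command_line(FILE *stream)
{
    char *buffer = NULL;  /* NULL + size 0 asks getline() to allocate */
    size_t size = 0;
    ssize_t nread = getline(&buffer, &size, stream);

    if (nread == -1) {
        free(buffer);     /* getline() may allocate even when it fails */
        return NULL;
    }
    buffer[strcspn(buffer, "\n")] = '\0';
    return buffer;
}
```

Starting with a NULL buffer and a size of 0 lets getline() choose the allocation, so the shell does not need to guess a maximum line length.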

String Tokenization

Once a line has been entered, the shell splits it into words and stores them in an array of strings. It does this with the function strtok(), which uses a set of delimiter characters (commonly spaces) to find the token boundaries and replaces each delimiter it finds with a null byte '\0'. The first token is considered the main command, in this case ls.
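A minimal sketch of this tokenization step is shown below. The function name split_line, the delimiter set (space, tab, newline) and the caller-supplied array size are assumptions for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits `line` in place on spaces, tabs and
 * newlines. Stores pointers to the tokens in `tokens`, terminates the
 * array with NULL, and returns the number of tokens.
 * `max_tokens` is the capacity of `tokens` and must be at least 1. */
size_t split_line(char *line, char *tokens[], size_t max_tokens)
{
    size_t count = 0;
    char *token = strtok(line, " \t\n");

    while (token != NULL && count + 1 < max_tokens) {
        tokens[count++] = token;
        token = strtok(NULL, " \t\n"); /* NULL means: continue with the same string */
    }
    tokens[count] = NULL;              /* execve() expects a NULL-terminated array */
    return count;
}
```

Because strtok() writes null bytes into the string it scans, the line must be a modifiable buffer, not a string literal.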

Alias and Expansions

The shell first looks for special characters such as ", ', `, \, *, &, #, etc. The * is a special character called a wildcard: it matches any sequence of characters, so a pattern like *.c tells the shell to list only the files whose names end in .c. The shell also checks whether the command is an alias, which is a shortcut name for a longer command, and if so replaces it with the full command.
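The alias check can be pictured as a simple table lookup. The sketch below is only an illustration under assumptions: the struct, the function name expand_alias and the two table entries are hypothetical, and real shells store aliases that the user defines at run time.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical alias entry: a shortcut name and the command it stands for. */
struct alias {
    const char *name;
    const char *value;
};

/* Example entries only. */
static const struct alias alias_table[] = {
    { "ll", "ls -l" },
    { "la", "ls -a" },
};

/* Returns the full command for `word` if it is a known alias,
 * otherwise returns `word` unchanged. */
const char *expand_alias(const char *word)
{
    size_t i;

    for (i = 0; i < sizeof(alias_table) / sizeof(alias_table[0]); i++) {
        if (strcmp(word, alias_table[i].name) == 0)
            return alias_table[i].value;
    }
    return word;
}
```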

Built-ins

If the shell doesn't find any special characters or aliases, it checks whether the command is a built-in. Built-ins are commands or functions that the shell executes directly, without launching a separate program, like "exit" (which exits the shell) and "env" (which prints the environment variables). In Bash, env is an external program, but simple shells often implement it as a built-in.
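A built-in check along these lines could look like the sketch below. The function name run_builtin and the should_exit out-parameter are assumptions for illustration. The two built-ins handled are the ones named above.

```c
#include <stdio.h>
#include <string.h>

extern char **environ; /* the process environment, provided by the C library */

/* Hypothetical helper: runs `args[0]` if it is a built-in.
 * Returns 1 if the command was handled here, 0 if the shell should
 * go on and look for an external program.
 * Sets *should_exit to 1 when the built-in was "exit". */
int run_builtin(char *const args[], int *should_exit)
{
    *should_exit = 0;
    if (args == NULL || args[0] == NULL)
        return 0;

    if (strcmp(args[0], "exit") == 0) {
        *should_exit = 1;
        return 1;
    }
    if (strcmp(args[0], "env") == 0) {
        char **var;

        for (var = environ; var != NULL && *var != NULL; var++)
            puts(*var); /* one NAME=value pair per line */
        return 1;
    }
    return 0;
}
```

For "ls -l" this returns 0, so the shell moves on to the PATH search.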

PATH

If the command doesn't match any built-in, the shell looks at the variable $PATH, which contains a list of predefined directories separated by : in which the shell searches for the commands or programs to execute. This saves time, since the system does not have to search every directory for the program to run. For the same reason, if a program's directory does not appear in PATH, the shell cannot execute the program until we give it the exact path to it. A PATH value might look like /usr/local/bin:/usr/bin:/bin.

The shell appends the command to each directory in turn until it finds the path to the executable that corresponds to the command. For ls, that is /bin/ls. It checks whether the file exists with the function stat(), which returns 0 if it does, and then checks whether the file it found is executable using the access() function.
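The search can be sketched as follows. The function name find_in_path and its parameters are assumptions for illustration, and the S_ISREG check, which skips directories that happen to share the command's name, is an addition not described above.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* exposes strdup() on strict compilers */
#endif
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: looks for `command` in each directory of the
 * colon-separated list `path_var`. On success writes the full path
 * into `out` and returns 0. Returns -1 if nothing suitable was found. */
int find_in_path(const char *command, const char *path_var,
                 char *out, size_t out_size)
{
    char *copy, *dir;
    struct stat st;
    int found = -1;

    if (command == NULL || path_var == NULL || out_size == 0)
        return -1;
    copy = strdup(path_var);           /* strtok() modifies its input */
    if (copy == NULL)
        return -1;

    for (dir = strtok(copy, ":"); dir != NULL; dir = strtok(NULL, ":")) {
        int n = snprintf(out, out_size, "%s/%s", dir, command);

        if (n < 0 || (size_t)n >= out_size)
            continue;                  /* candidate path does not fit */
        if (stat(out, &st) == 0 &&     /* the file exists ... */
            S_ISREG(st.st_mode) &&     /* ... is a regular file ... */
            access(out, X_OK) == 0) {  /* ... and we may execute it */
            found = 0;
            break;
        }
    }
    if (found != 0)
        out[0] = '\0';
    free(copy);
    return found;
}
```

A shell would typically call it as find_in_path("ls", getenv("PATH"), buf, sizeof(buf)).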

If the stat() function returns -1 for every directory, which means that no file with that name was found, the shell shows an error.

The error is made up of the name of the program that runs the shell, ./hsh, followed by the number of the command, 1, the command itself, fake_command, and the message not found.
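Building such a message is a single formatted write. In the sketch below the function name format_not_found is hypothetical, and the ": " separators are an assumption modeled on how sh reports unknown commands. The four fields are the ones listed above.

```c
#include <stdio.h>

/* Hypothetical helper: writes "<shell>: <count>: <command>: not found"
 * followed by a newline into `out`. Returns what snprintf() returns:
 * the length the full message needs, excluding the final '\0'. */
int format_not_found(char *out, size_t out_size, const char *shell_name,
                     unsigned int count, const char *command)
{
    return snprintf(out, out_size, "%s: %u: %s: not found\n",
                    shell_name, count, command);
}
```

A shell would normally write the resulting text to standard error, not standard output.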

Command Execution

If the command is found and is an executable file, the shell starts its execution using two system calls, fork() and execve().

Before executing the command, the shell forks itself. fork() is a system call that creates a new process (called a child process) by duplicating the calling process. Both processes then run concurrently, each with its own process ID, or pid, which allows the system to tell them apart. The child process is the one the shell uses to execute the command.

https://github.com/Ramonrune/fork-threads-processes/blob/master/README.md

After the child process is created, it calls execve(). This system call receives the path to the executable ("/bin/ls"), an array with the command and its options ("ls", "-l", NULL), and an array with the environment variables. It replaces the program running in the child process with ls. While ls runs, the parent process waits (with the function wait()) until the child process completes its execution. Once it does, the child process ends and only the parent process remains. The prompt then appears again and you can enter another command.
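Put together, the fork, execve and wait sequence can be sketched like this. The function name run_command is hypothetical, passing the shell's own environ as the environment is an assumption, and the exit code 127 for a failed execve() follows the usual shell convention.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ; /* the process environment, provided by the C library */

/* Hypothetical helper: runs the program at `path` with arguments `args`
 * in a child process and waits for it.
 * Returns the child's exit status, or -1 if it could not be obtained. */
int run_command(const char *path, char *const args[])
{
    int status;
    pid_t pid = fork();

    if (pid == -1) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: replace this process image with the command. */
        execve(path, args, environ);
        perror("execve"); /* only reached if execve() failed */
        _exit(127);
    }
    /* Parent: block until the child finishes. */
    if (wait(&status) == -1) {
        perror("wait");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For the example in this article, args would hold "ls", "-l" and NULL, and path would be "/bin/ls".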

After all this, the command ls -l has been executed and its long-format listing of the directory appears in the terminal.

Hope you understand better what it takes for the shell to list your files in long form and what is really going on under the surface. We wish you the best.
