Process management and fork-exec

Process Management

Unix-like systems use the fork and exec system calls to create processes and to run programs.

The following system calls are used for basic process management.

  1. fork
  2. the exec family of functions
  3. exit
  4. wait

The following diagram shows the process lifecycle. A parent process invokes fork to copy itself; it can then either continue executing or wait for the child process to terminate.

What happens when you run a program?

When you launch an application or run a command in the terminal, the fork and exec system calls are used to run the program. fork creates a new process that is a copy of the caller. The child then passes the path of the requested program binary, along with its arguments, to an exec call. exec replaces the process's code, stack, and data with those of the new program and starts executing it.

In the following diagram, the parent process Application X invokes fork to create a clone of itself. The forked process has its own process ID N2, different from the parent's process ID N1. On success, the parent process receives the PID of the child as the return value of fork. Next, the child process invokes exec with the binary path of the target Application Y and other parameters to run that program. The exec call replaces the current process code and data with the Application Y code and data and starts executing it.
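The flow described above can be sketched in C as follows. This is a minimal illustration rather than a definitive implementation: the helper name run_program is hypothetical, execvp is picked as one convenient member of the exec family, and the exit code 127 for a failed exec is only a shell convention adopted here.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with the given arguments in a child
 * process. Returns the child's exit status, or -1 on failure. */
int run_program(char *const argv[])
{
    pid_t pid = fork();              /* clone the calling process */
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child (PID N2): replace this process image with the target. */
        execvp(argv[0], argv);
        /* execvp returns only if it failed. */
        perror("execvp");
        _exit(127);
    }
    /* Parent (PID N1): pid holds the child's PID; block until it ends. */
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_program with an array such as {"ls", "-l", NULL} would run ls in a child process while the caller waits for it to finish.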

fork

The fork system call creates a clone of the calling process, copying the code, stack, and data of the parent process. The parent and the forked process have different process IDs (PIDs), and each one can tell which it is from the return value of fork. On failure, fork returns -1 to the caller and no child is created. On success, fork returns twice, once in the parent and once in the child:

  1. It returns 0 in the child process
  2. It returns the process ID of the child in the parent process
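A small sketch of how a program might branch on these return values. The function name fork_demo and the messages are made up for illustration. The child leaves through _exit so that it never runs the parent's remaining code.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: fork once and report which branch each process takes.
 * Returns the child's PID in the parent, or -1 if fork failed. */
pid_t fork_demo(void)
{
    fflush(stdout);                  /* avoid duplicating buffered output */
    pid_t pid = fork();
    if (pid == -1) {                 /* failure: no child was created */
        perror("fork");
        return -1;
    }
    if (pid == 0) {                  /* child: fork returned 0 */
        printf("child:  fork returned 0, my PID is %ld, parent is %ld\n",
               (long)getpid(), (long)getppid());
        fflush(stdout);              /* _exit does not flush stdio buffers */
        _exit(0);
    }
    /* parent: fork returned the child's PID */
    printf("parent: fork returned %ld, my PID is %ld\n",
           (long)pid, (long)getpid());
    waitpid(pid, NULL, 0);           /* reap the child */
    return pid;
}
```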

the exec family of functions

The exec family of system calls replaces the program executed by a process. An exec call takes the program path and program arguments as input, discards the current process address space, loads the code, data, and stack of the new program, and initializes the required control registers. Open file descriptors remain open across exec unless they are explicitly marked close-on-exec. Note that exec does not create a new process; it reuses the process that invokes it, so the PID does not change.
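As an illustration of the close-on-exec flag mentioned above, the sketch below marks a descriptor so that it is closed automatically when the process calls exec. The helper name set_cloexec is hypothetical; fcntl with F_GETFD, F_SETFD, and FD_CLOEXEC is the standard POSIX interface for this.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: mark fd as close-on-exec so that it does not
 * stay open in a program started by a later exec call.
 * Returns 0 on success, -1 on error. */
int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);          /* read descriptor flags */
    if (flags == -1)
        return -1;
    if (fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1)
        return -1;
    return 0;
}
```

A process would typically call this on descriptors such as log files or pipe ends that the new program should not inherit.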

#include <unistd.h>

extern char **environ;

int execl(const char *path, const char *arg0, ..., const char *argn, (char *)0);
int execle(const char *path, const char *arg0, ..., const char *argn, (char *)0, char *const envp[]);
int execlp(const char *file, const char *arg0, ..., const char *argn, (char *)0);
int execlpe(const char *file, const char *arg0, ..., const char *argn, (char *)0, char *const envp[]); /* non-standard: not in POSIX or glibc */
int execv(const char *path, char *const argv[]);
int execve(const char *path, char *const argv[], char *const envp[]);
int execvp(const char *file, char *const argv[]);
int execvpe(const char *file, char *const argv[], char *const envp[]); /* GNU extension */

All functions in the exec family do essentially the same thing: loading a new program into the current process and providing it with the required parameters. The differences are in how the program is found, how the arguments are specified, and where the environment comes from.

  • The calls with v in the name take an array parameter to specify the argv[] array of the new program. The end of the arguments is indicated by an array element containing NULL.
  • The calls with l in the name take the arguments of the new program as a variable-length argument list to the function itself. The end of the arguments is indicated by a (char *)NULL argument. You should always include the typecast, because NULL is allowed to be an integer constant, and default argument conversions when calling a variadic function won’t convert that to a pointer.
  • The calls with e in the name take an extra argument envp to provide the environment of the new program; otherwise, the program inherits the current process’s environment. Like argv, envp is a NULL-terminated array of strings, for both execve() and execle(). In execle() it follows the (char *)NULL that ends the argument list.
  • The calls with p in the name search the PATH environment variable to find the program if it doesn’t have a directory in it (i.e. it doesn’t contain a / character). Otherwise, the program name is always treated as a path to the executable.
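The naming rules above can be compared side by side in a short sketch. The enum, the helper name run_style, the exit codes 7, 8, and 9, and the variable CODE are all invented for the illustration, and it assumes a POSIX shell at /bin/sh.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum exec_style { STYLE_EXECV, STYLE_EXECLP, STYLE_EXECLE };

/* Hypothetical helper: run a tiny shell command through one exec variant
 * and return the child's exit status, or -1 on failure. */
int run_style(enum exec_style style)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        switch (style) {
        case STYLE_EXECV: {
            /* v: arguments in a NULL-terminated array; no p, so the
             * program is named by its full path. */
            char *const argv[] = {"sh", "-c", "exit 7", NULL};
            execv("/bin/sh", argv);
            break;
        }
        case STYLE_EXECLP:
            /* l: arguments listed inline, ended by (char *)NULL;
             * p: "sh" has no '/', so PATH is searched. */
            execlp("sh", "sh", "-c", "exit 8", (char *)NULL);
            break;
        case STYLE_EXECLE: {
            /* e: the new program gets only this environment. */
            char *const envp[] = {"CODE=9", NULL};
            execle("/bin/sh", "sh", "-c", "exit $CODE", (char *)NULL, envp);
            break;
        }
        }
        _exit(127);                  /* reached only if exec failed */
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

In the execle case the shell sees CODE only because it appears in envp; nothing else from the caller's environment is passed along.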

exit

The exit system call terminates the calling process with an exit status, which the parent can later retrieve with wait.
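A minimal sketch of how an exit status travels from child to parent. The helper name observed_exit_status is hypothetical. The child uses _exit, the direct system call wrapper; the C library function exit would additionally run atexit handlers and flush stdio buffers before terminating. Only the low 8 bits of the status reach the parent.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that terminates with the given code
 * and return the exit status the parent observes, or -1 on failure. */
int observed_exit_status(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);                 /* child terminates immediately */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    /* WEXITSTATUS yields the low 8 bits of the child's exit code. */
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With this helper, a child exiting with 3 is observed as 3, while a child exiting with 261 is observed as 5, because 261 modulo 256 is 5.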

wait

The wait system call is used by the parent to suspend its execution until a child terminates and to obtain the exit status of that child. wait is a blocking call: execution in the calling process is suspended until wait returns.
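When a parent has several children, it can call wait repeatedly, and each call returns as soon as any one child has terminated. The sketch below assumes a hypothetical helper spawn_and_reap that starts n children exiting with codes 1 to n and adds up the statuses it collects. It assumes n is small and that the caller has no other children.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: start n children that exit with codes 1..n, then
 * reap each one with wait. Returns the sum of the exit statuses, or -1
 * on failure. */
int spawn_and_reap(int n)
{
    int started = 0;
    for (int i = 1; i <= n; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;                   /* stop spawning, reap what exists */
        if (pid == 0)
            _exit(i);                /* child: terminate with code i */
        started++;
    }

    int sum = 0;
    for (int reaped = 0; reaped < started; ) {
        int status;
        pid_t done = wait(&status);  /* blocks until some child ends */
        if (done == -1) {
            if (errno == EINTR)
                continue;            /* interrupted by a signal: retry */
            return -1;
        }
        if (WIFEXITED(status))
            sum += WEXITSTATUS(status);
        reaped++;
    }
    return started == n ? sum : -1;
}
```

The children may finish in any order; wait does not let the caller choose which one to collect, which is why waitpid is often preferred when a specific child matters.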