One of my projects over the upcoming semester is to explore the Linux boot process and the job of the init (initialization) system. The init system runs as PID 1 and is responsible for a significant portion of userspace functionality. Common init systems on Linux are OpenRC and systemd. One of the key jobs of an init system is to spawn new processes, much like a command shell does.
Executing child processes may be useful to any number of programs, but common applications include:
- Shells
- Init systems
- Launchers
- Interfacing with command line applications
Using libc
If you are familiar with C/C++, or other languages, you may have used one of the following functions:
int ;
int ;
int ;
int ;
int ;
Each of these functions provides some variant of process spawning. Most of these are still available in Rust if you so desire. They can be accessed in libc. Note these are unsafe C bindings.
pub unsafe
This unsafe access is exactly what it says on the box: unsafe. We'll be forced to use C constructs directly and manipulate raw pointers, which is not an ideal scenario. In order to use this code properly we'd need to construct safe wrappers. But surely there is a better way?
Using std::io::process
The std::io::process module provides robust facilities for spawning child processes. In particular, Command allows us to build and spawn processes easily.
Introducing Command
std::io::process::Command, aliased as std::io::Command, is a type that acts as a process builder. The Command::new() constructor sets up several sane defaults for the program for you.
The various builder functions allow for customization over the defaults, which are:
- No arguments to the program
- Inherit the current process's environment
- Inherit the current process's working directory
- A readable pipe for stdin (file descriptor 0)
- A writeable pipe for stdout and stderr (file descriptors 1 and 2)
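To make the builder idea concrete, here is a minimal sketch that overrides each of those defaults in turn. It assumes the current std::process module rather than the std::io::process path named above, and the shell one-liner, the GREETING variable, and the build_customized helper are made up for illustration. One caveat: in today's standard library the standard streams are inherited by default for spawn() and only piped for output(), so the sketch sets all three explicitly.

```rust
use std::process::{Command, Stdio};

/// Builds (but does not run) a command with every default overridden.
/// The helper name and the shell one-liner are illustrative only.
fn build_customized() -> Command {
    let mut cmd = Command::new("/bin/sh");
    cmd.args(&["-c", "echo \"$GREETING from $(pwd)\""]) // explicit arguments
        .env_clear() // do not inherit the parent's environment
        .env("GREETING", "hello") // expose exactly one variable
        .current_dir("/") // do not inherit the working directory
        .stdin(Stdio::null()) // nothing to read on fd 0
        .stdout(Stdio::piped()) // capture fd 1 through a pipe
        .stderr(Stdio::inherit()); // send fd 2 to the parent's stderr
    cmd
}

fn main() {
    let output = build_customized().output().expect("failed to run /bin/sh");
    print!("{}", String::from_utf8_lossy(&output.stdout));
}
```

Nothing runs until output(), spawn(), or status() is called on the builder, so the same Command can be configured in one place and started in another.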
In and Out
In a simple example, let's collect the output of ps aux:
use Command;
It should be noted that this is a blocking call, meaning the current task will halt until the completion of the process. This is acceptable for simple calls to the underlying operating system. .output() will handle all the tasks related to piping and spawning for you.
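As a sketch of that blocking call, the snippet below runs a program to completion and hands back whatever it printed. It is written against the current std::process API, which is an assumption relative to the module path used in this post, and capture_stdout is a hypothetical helper name.

```rust
use std::process::Command;

/// Runs `program` to completion and returns everything it wrote to stdout.
/// The call blocks until the child has exited.
fn capture_stdout(program: &str, args: &[&str]) -> std::io::Result<String> {
    let output = Command::new(program).args(args).output()?;
    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
}

fn main() {
    match capture_stdout("ps", &["aux"]) {
        Ok(text) => println!("ps printed {} lines", text.lines().count()),
        Err(e) => eprintln!("could not run ps: {}", e),
    }
}
```

The returned Output value also carries the exit status and the captured stderr, which the helper ignores for brevity.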
If you're dealing with multiple arguments, you can pass a slice, like so:
let the_output = Command::new("ps").args(&["a", "u", "x"]).output()
    .ok().expect("Failed to execute.");
Spawning and managing the Children
Waiting for the command to completely return is kind of lame. What would be better is some way to keep track of the process and communicate with it.
The Process type returned by .spawn() does just this.
use Command;
We can get the PID of the child:
// Get the PID of the process.
println!("The PID is: {}", the_process.id());
Or signal the process:
// Signal the process.
// 0 is interpreted as a poll check.
match the_process.signal(0) {
    Ok(()) => println!("Process still alive!"),
    Err(e) => println!("Process dead: {}", e),
}
Note: .signal_exit() and .signal_kill() are also available.
Wait for the process before returning, receiving its status:
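// Wait for the process to exit.
match the_process.wait() {
    Ok(status) => println!("Finished, status of {}", status),
    Err(e) => println!("Failed, error: {}", e),
}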
Gotcha: Some processes will not exit until you drain their STDOUT.
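Putting the PID, poll, signal, and wait steps together, a rough sketch might look like the following. It assumes the current std::process API, where the handle is called Child and there is no general signal method: try_wait() stands in for the signal-0 poll and kill() for .signal_kill(). The sleep command and the spawn_poll_kill name are placeholders.

```rust
use std::io;
use std::process::{Command, ExitStatus};

/// Spawns a long sleep, polls it once, kills it, and reaps it.
/// Returns the PID, whether the poll saw it running, and the final status.
fn spawn_poll_kill() -> io::Result<(u32, bool, ExitStatus)> {
    // spawn() returns immediately with a handle to the running child.
    let mut child = Command::new("sleep").arg("30").spawn()?;

    // Get the PID of the process.
    let pid = child.id();

    // Non-blocking poll: Ok(None) means the child has not exited yet.
    let still_running = child.try_wait()?.is_none();

    // Forcefully stop the child, then wait so it does not linger as a zombie.
    child.kill()?;
    let status = child.wait()?;

    Ok((pid, still_running, status))
}

fn main() -> io::Result<()> {
    let (pid, still_running, status) = spawn_poll_kill()?;
    println!("pid {} was running: {}, final status: {}", pid, still_running, status);
    Ok(())
}
```

Sending any other signal would need a platform binding such as kill(2) from libc, which is outside what this sketch covers.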
Retrieve STDOUT, interacting with it like any Reader:
// Get a Pipestream, which implements the Reader trait.
let the_stdout_stream = the_process.stdout.as_mut
.expect;
// Drain it into a &mut [u8].
let the_stdout = the_stdout_stream.read_to_end
.expect;
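For comparison, here is how draining STDOUT might look with the current std::process API (an assumption, since the post targets std::io::process). The pipe type is ChildStdout and the trait is std::io::Read in that API; drain_stdout is a hypothetical helper.

```rust
use std::io::Read;
use std::process::{Command, Stdio};

/// Spawns `program`, drains its stdout into a byte buffer, then waits for it.
/// Returns the bytes read and whether the child exited successfully.
fn drain_stdout(program: &str, args: &[&str]) -> std::io::Result<(Vec<u8>, bool)> {
    let mut child = Command::new(program)
        .args(args)
        .stdout(Stdio::piped()) // ask for a pipe instead of inheriting fd 1
        .spawn()?;

    // take() moves the pipe out of the Child so we own it for the read.
    let mut stdout = child.stdout.take().expect("stdout was not piped");

    // read_to_end appends into a growable Vec<u8> until the child closes the pipe.
    let mut buffer = Vec::new();
    stdout.read_to_end(&mut buffer)?;

    // Draining first means the child is never stuck writing to a full pipe.
    let status = child.wait()?;
    Ok((buffer, status.success()))
}

fn main() {
    match drain_stdout("ps", &["aux"]) {
        Ok((bytes, ok)) => println!("read {} bytes, success: {}", bytes.len(), ok),
        Err(e) => eprintln!("could not run ps: {}", e),
    }
}
```

Reading before waiting is the order that sidesteps the gotcha mentioned earlier.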
Pipe into STDIN, then wait for output and exit:
use Command;
Rust's borrow checker ensures that the process cannot be closed until it is safe to do so.
Without the scope, the_stdin_stream would still be alive when we try to call the_process.wait_with_output(). If this weren't tracked, the_stdin_stream might be used even after the process is closed, which would be unsafe. We use a scope to limit the lifetime of the_stdin_stream; a function could also accomplish this. More info on lifetimes.
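A sketch of the scoped-borrow pattern, again assuming the current std::process API rather than the one this post was written against. The pipe_through helper and the use of cat are illustrative. With today's borrow checker the inner braces are probably optional, since the borrow ends at its last use; they are kept here to mirror the explanation above.

```rust
use std::io::Write;
use std::process::{Command, Stdio};

/// Feeds `input` to the child's stdin and returns what it wrote to stdout.
/// Suitable for small inputs only: a large write could block if the child
/// fills its stdout pipe before we start reading.
fn pipe_through(program: &str, input: &[u8]) -> std::io::Result<Vec<u8>> {
    let mut child = Command::new(program)
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .spawn()?;

    {
        // The mutable borrow of the child's stdin lives only in this scope.
        let stdin = child.stdin.as_mut().expect("stdin was not piped");
        stdin.write_all(input)?;
    }

    // wait_with_output takes the child by value, closes its stdin,
    // collects the remaining output, and waits for it to exit.
    let output = child.wait_with_output()?;
    Ok(output.stdout)
}

fn main() -> std::io::Result<()> {
    let echoed = pipe_through("cat", b"hello from the parent\n")?;
    print!("{}", String::from_utf8_lossy(&echoed));
    Ok(())
}
```

Because wait_with_output consumes the child, the compiler rejects any later use of the handle or its streams.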
Init's Perspective
An init system is concerned about more than just the output of a process. It's concerned about the entire lifetime, which user ID runs it, what kind of ENV is exposed to it, what other processes depend on it, and where its STDOUT and STDERR go. So what would a full call to Command look like for an init system?
Let's say we want to spawn curl, a very long running process, and map its STDOUT and STDERR to files. We'll also explicitly declare which user and group it should run as, as well as its CWD and ENV variables.
In its simplest form:
extern crate native;
extern crate rustrt;
use ;
use file;
use rtio;
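For readers on a current toolchain, roughly the same idea can be sketched with std::process and the Unix-only CommandExt trait, with no extra crates. This is an assumption-laden sketch, not the code this post was written with: the Service struct, the start function, the log file names, and the example URL are all hypothetical.

```rust
use std::fs::File;
use std::io;
use std::os::unix::process::CommandExt; // Unix-only: adds uid() and gid()
use std::path::Path;
use std::process::{Child, Command, Stdio};

/// Hypothetical description of one supervised process.
struct Service<'a> {
    program: &'a str,
    args: &'a [&'a str],
    cwd: &'a Path,
    env: &'a [(&'a str, &'a str)],
    stdout_path: &'a Path,
    stderr_path: &'a Path,
    /// (uid, gid) to run as; switching users needs sufficient privileges.
    ids: Option<(u32, u32)>,
}

/// Starts the service with an explicit environment, working directory,
/// and log files, and returns the handle without waiting for it.
fn start(service: &Service) -> io::Result<Child> {
    let stdout = File::create(service.stdout_path)?;
    let stderr = File::create(service.stderr_path)?;

    let mut cmd = Command::new(service.program);
    cmd.args(service.args)
        .current_dir(service.cwd)
        .env_clear() // expose only the variables listed in the service
        .envs(service.env.iter().copied())
        .stdin(Stdio::null())
        .stdout(Stdio::from(stdout)) // map STDOUT to a file
        .stderr(Stdio::from(stderr)); // map STDERR to a file
    if let Some((uid, gid)) = service.ids {
        cmd.uid(uid).gid(gid);
    }
    cmd.spawn()
}

fn main() {
    let dir = std::env::temp_dir();
    let (out, err) = (dir.join("curl.out.log"), dir.join("curl.err.log"));
    let service = Service {
        program: "curl",
        args: &["--silent", "--show-error", "https://example.com/"],
        cwd: Path::new("/"),
        env: &[("PATH", "/usr/bin:/bin")],
        stdout_path: &out,
        stderr_path: &err,
        ids: None, // e.g. Some((1000, 1000)) when running as root
    };
    match start(&service) {
        Ok(mut child) => println!("curl finished: {:?}", child.wait()),
        Err(e) => eprintln!("could not start curl: {}", e),
    }
}
```

Returning the Child instead of waiting inside start leaves the caller free to poll or signal the process later.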
An init system often tracks many processes. How could you use the above code in a setting where multiple processes are needed? How could we utilize various constructs to monitor and augment the capabilities of a system?
This is only the humble beginning.