No more problems with the `service` command


The nosh service command, supplied as part of the System 5 shims package, does not have the well-known problems that the FreeBSD/TrueOS/NetBSD service command has.

You can safely use it from a login session, or from a WWW service, or if you read a different language to the other system administrator(s). You can also safely use it from package installation and uninstallation utility scripts, although this is not recommended, as such scripts really should take administrator-defined "presets" and "policies" into account, which service does not.

You can use it to manually start dæmons that you have configured not to auto-start at system bootstrap.

Background

Inheritance and the daemonization fallacy

The service commands in the Linux, FreeBSD/TrueOS, and NetBSD rc systems function quite simply: Given a start, stop, or restart command, they invoke a script from /etc/init.d/, /etc/rc.d/, or (in the case of the poor multiple personality RedHat people) /etc/rc.d/init.d/ to do the grunt work. service cron start becomes /etc/rc.d/cron start, which in turn forks off the dæmon process to be started.

The problem with this is that it relies upon the flawed notion that one can fork a process from an interactive login session and convert that process into a dæmon running in a known and stable execution context, free from security loopholes. Despite a wealth of computer folklore to the contrary, including books and manual pages for optimistic library functions, this is in fact not possible.

Back in the 1980s, it would have been. All that a process then had to overcome was the controlling terminal, session, and process group hurdles. However, those were the operating systems of the 1980s. Modern Unices and Linux have since accrued a whole load more trap doors that a login process passes through in the course of its long transition from part of the Trusted Computing Base to interactive login session leader, including SELinux and other "enhanced security" settings, control groups, mount namespaces, changed-ID "tainted" state (cf. FreeBSD's issetugid()), resource limits, unmodifiable login/terminal/user names held in special kernel-space variables (cf. AIX's setsenv command), and session names (cf. OpenBSD's setlogin()). Several of these are one-way and inherited by all login session processes, intentionally.

This is in addition to the long-standing problems of unclean and unreliable inherited environment, open file descriptors, root and working directories, session, process group, and controlling terminal.

The result is that many system administrators can tell amusing and appalling tales of how this practice goes badly wrong, such as:

The HTTP server that had server-side scripting, where the scripts used service to start up an adjunct dæmon; but the service command, and hence the started dæmon, inherited the open file descriptor of the listening HTTP socket, thereby holding that socket in the listening state and preventing the HTTP server from cleanly restarting. (This StackOverflow question has it being inherited by the SAMBA dæmon and is but another variation on the same theme.)
The system administrator who spoke a different language and who ran service from an interactive login session configured for his locale, causing all of the started dæmon's log messages to be written in his alphabet and language, much to the confusion of everyone else.
The system administrator who did a system upgrade, only to find that all of the dæmons that the upgrade process restarted had his login session's Kerberos settings in their environments, making some of them try to employ Kerberos subsystems where they hadn't been before. (See Debian Bug #631081.)
The system administrator who ran service from an SSH session with a terminal and found the dæmons, started as processes in that terminal session, being terminated by terminal hangup at session end. (See FreeBSD Bug #215540.)
The system administrator who ran service from a login session and found that all of the dæmons that this (re)started were logged by the Linux audit subsystem as xem, because the login UID was configured in the modern fashion as immutable once set.
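What these tales have in common can be reproduced in a few lines. The following is a minimal sketch, not anything taken from a real service command: it spawns a child the way a script run from a login session would, and has the child report what it inherited. The `spawn_and_report` name and the `de_DE.UTF-8` locale value are illustrative assumptions.

```python
import os
import subprocess
import sys

# Child program: report a few pieces of inherited process state.
REPORT = (
    "import os;"
    "print(os.environ.get('LANG', ''));"
    "print(os.getsid(0));"
    "print(os.getcwd())"
)


def spawn_and_report(extra_env):
    """Spawn a child from the current session and return what it inherited."""
    # Start from the caller's own environment, as a forking script does.
    env = dict(os.environ)
    env.update(extra_env)
    result = subprocess.run(
        [sys.executable, "-c", REPORT],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    lang, sid, cwd = result.stdout.splitlines()
    return {"LANG": lang, "sid": int(sid), "cwd": cwd}


if __name__ == "__main__":
    # Pretend the administrator's login session uses a German locale.
    info = spawn_and_report({"LANG": "de_DE.UTF-8"})
    print("child LANG:", info["LANG"])
    print("same session as caller:", info["sid"] == os.getsid(0))
    print("same working directory:", info["cwd"] == os.getcwd())
```

The child ends up with the caller's locale, session, and working directory without anyone having asked for that. Environment, session, and working directory are only the easy cases; the one-way settings listed earlier are inherited in the same manner and cannot be undone by the child.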

The NetBSD rc.d system is a system of shell scripts, and its internal system state, tracking where it is in the process of invoking all of its shell scripts, is stored in shell variables. Devin Teske once demonstrated quite graphically that all of that state is inherited by the dæmons started by rc. This was intended to show one part of the dæmonization fallacy: that dæmons started by service in an interactive login session don't have the same initial state as dæmons started by the system bootstrap. But the wealth of leaked internal rc state also shows that dæmons are not started in a particularly clean state even when not started by the service command.

The right way for an administrator to start a dæmon from an interactive login session was recognized and implemented in the early 1990s. The newly minted System Resource Controller in IBM AIX version 3 had a central service manager, named srcmstr, that listened on a socket, named /dev/SRC. The administrator started dæmons with the startsrc command, which didn't fork them directly but instead sent requests over the /dev/SRC socket to the service manager. The service manager started all dæmons up in the same stable process state, one that was never part of an interactive login session in the first place.

Many subsequent systems adopted this notion.

daemontools 0.51 began, in 1997, with a supervise program to actually supervise each dæmon and a svc program that talked to it, sending requests over FIFOs. Initially, the idea was to run supervise from an interactive login session, but experience taught all of the even-then well-known pitfalls of that, and soon daemontools gained a svscan and a svscanboot tool, which together ensured that the supervise processes were started in a clean, stable, secure, non-interactive, and non-login execution environment.
All of the subsequent daemontools family systems share this design. Gerrit Pape discusses it in the blurb for runit.
Nico Schottelius' cinit has a cinit program that is process #1, supervising a bunch of dæmons and handling the actual work of starting and stopping them. It sets up a socket in /etc/cinit/tmp/coala, which its cservice program sends dæmon start/stop commands over, rather than forking things directly.
Joachim Nilsson's finit has a finit program that is process #1, supervising a bunch of dæmons and handling the actual work of starting and stopping them. It sets up a socket in /var/run/finit.sock, which its initctl program sends dæmon start/stop requests to, rather than forking things directly. Its service compatibility shim, new in 2015, is just a thin wrapper around initctl.
Felix von Leitner's minit likewise has minit to supervise, start, and stop dæmons. Its msvc likewise doesn't fork things directly but sends commands via /etc/minit/in and /etc/minit/out FIFOs.
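The design that these systems share can be sketched as follows. This is a toy model under stated assumptions, not the protocol of any of the systems named above: the request format, the `SERVICES` table, the `CLEAN_ENV` value, and the function names are all hypothetical. The control tool only writes a request to a socket; the manager side does the spawning, with an execution context that it chooses itself.

```python
import os
import socket
import subprocess

# Hypothetical fixed execution context chosen by the manager, not the caller.
CLEAN_ENV = {"PATH": "/usr/bin:/bin"}

# Hypothetical service table: service name -> command line.
SERVICES = {
    "show-krb": ["/bin/sh", "-c", "echo ${KRB5CCNAME:-unset}"],
}


def send_start(sock, name):
    """Control-tool side: send a request; never fork the dæmon itself."""
    sock.sendall(f"start {name}\n".encode())


def serve_one(sock):
    """Manager side: read one request and spawn the service itself."""
    verb, _, name = sock.recv(4096).decode().strip().partition(" ")
    if verb != "start" or name not in SERVICES:
        sock.sendall(b"error\n")
        return None
    proc = subprocess.Popen(
        SERVICES[name],
        env=CLEAN_ENV,              # replaces the environment entirely
        cwd="/",                    # known working directory
        stdin=subprocess.DEVNULL,   # no terminal on standard input
        stdout=subprocess.PIPE,     # captured here only for the demonstration
        start_new_session=True,     # own session, no controlling terminal
    )
    sock.sendall(b"ok\n")
    return proc


def demo():
    # Pollute the "login session" with a Kerberos setting.
    os.environ["KRB5CCNAME"] = "FILE:/tmp/krb5cc_admin"
    tool, manager = socket.socketpair()
    with tool, manager:
        send_start(tool, "show-krb")
        proc = serve_one(manager)
        reply = tool.recv(4096).decode().strip()
        output, _ = proc.communicate()
    return reply, output.decode().strip()


if __name__ == "__main__":
    print(demo())
```

For brevity both ends run in one process here, so the clean state has to be constructed explicitly. In the real systems the manager is a separate long-lived process that was never part of a login session, so there is nothing to scrub in the first place; that is the point of the design.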

The Linux, FreeBSD/TrueOS, and NetBSD rc systems did not, however. And the well-known problems have been biting their users for two decades.

No manual start class

In between the "automatic" and "disabled" start classes, the Service Control Manager on Windows NT has a "manual" start class. In addition to starting a service automatically at bootstrap and not starting it at all, one can start it manually, after bootstrap. The BSD service command is not as capable.

With the BSD service command, setting the service_enable flag in /etc/rc.conf{.local,} to off disables starting that service at all, automatically at bootstrap or otherwise. This is because the flag is checked by the individual /etc/rc.d/ scripts themselves whenever they are invoked with the start action, in the common run_rc_command library function. There are just two service start classes: automatic and totally disabled.

Again, this is something that was improved upon. The IBM AIX System Resource Controller permitted services to be automatically started at bootstrap (via startsrc entries in /etc/inittab), manually started from an interactive shell (again with startsrc), and started from the System Management Interface Tool (SMIT). But it wasn't the earliest incarnation of this concept. As mentioned, the Service Control Manager on Windows NT had this concept from Windows NT's first release.

Some BSDs have since gained a onestart verb, not given in their manuals, to tell the scripts to modify their checks.
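The difference between the two-class and three-class models can be written down as a pair of small predicates. This is a simplified model for illustration only, not the actual run_rc_command code or the Service Control Manager's logic; the function names are hypothetical, and treating onestart as simply skipping the enable check is an assumption.

```python
def rc_start_allowed(verb, enabled):
    """Two-class model: the script itself checks the enable flag on start."""
    if verb == "onestart":
        return True  # assumption: the verb asks the script to skip the check
    if verb == "start":
        return enabled
    raise ValueError(f"unmodelled verb: {verb}")


def three_class_start_allowed(start_class, at_bootstrap):
    """Three-class model: automatic, manual, and disabled."""
    if start_class == "automatic":
        return True
    if start_class == "manual":
        return not at_bootstrap  # never at bootstrap, but on request later
    if start_class == "disabled":
        return False
    raise ValueError(f"unknown start class: {start_class}")


if __name__ == "__main__":
    # A service that is switched off cannot be started by hand with "start".
    print(rc_start_allowed("start", enabled=False))
    # A "manual" service stays down at bootstrap but can be started afterwards.
    print(three_class_start_allowed("manual", at_bootstrap=True))
    print(three_class_start_allowed("manual", at_bootstrap=False))
```

In the first model, "not started at bootstrap" and "cannot be started at all" are the same switch. The second separates them, which is what a manual start class amounts to.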

The nosh `service` command

As aforementioned, all of the daemontools family toolsets share the design of separating out the service manager and not forking directly from the control tools. The nosh toolset is one of that family.

The nosh service command is a System 5 compatibility shim command. It's a thin wrapper around the native system-control command, and its start and stop verbs. This, in turn, communicates with the nosh service-manager via the FIFOs in each service's supervise/ control/status subdirectory. service-manager hasn't ever belonged to a login session, and so can fork the dæmon processes in a clean, stable, secure, non-interactive, and non-login execution environment.
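Control via a FIFO can be illustrated with a minimal sketch. The single-byte `u` ("up") command is borrowed from the daemontools svc convention and is an assumption here; the actual command set understood by the nosh service-manager may differ, and the function names and temporary directory layout are hypothetical.

```python
import os
import tempfile
import threading


def send_command(control_path, command):
    """Control-tool side: write a command to the supervisor's control FIFO."""
    fd = os.open(control_path, os.O_WRONLY)
    try:
        os.write(fd, command)
    finally:
        os.close(fd)


def read_commands(control_path, received):
    """Supervisor side: read commands until the writer closes the FIFO."""
    with open(control_path, "rb") as fifo:
        received.extend(fifo.read())


def demo():
    with tempfile.TemporaryDirectory() as service_dir:
        control = os.path.join(service_dir, "control")
        os.mkfifo(control)
        received = bytearray()
        # The supervisor stands in as a thread; in reality it is a separate
        # long-running process that was never part of a login session.
        reader = threading.Thread(target=read_commands, args=(control, received))
        reader.start()
        send_command(control, b"u")  # assumption: daemontools-style "up"
        reader.join()
        return bytes(received)


if __name__ == "__main__":
    print(demo())
```

The control tool's whole job is to open a FIFO and write to it. Nothing about the tool's own environment, open file descriptors, or session travels through the FIFO, so none of it can reach the dæmon.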

service-manager also doesn't pass down reams of its internal data to spawned dæmon processes, because it doesn't record its state using (inheritable) environment variables. The task of calculating and processing bootstrap and shutdown service invocation order isn't even performed by service-manager in the first place. It's performed by another process still, running the system-control program to start the normal and shutdown targets, which isn't the direct parent of any dæmon processes either.

The system-control start and stop commands are unaffected by the enabled/disabled state of the service, allowing for a disabled service to be manually started and stopped.

© Copyright 2015 Jonathan de Boyne Pollard. "Moral" rights asserted.
Permission is hereby granted to copy and to distribute this WWW page in its original, unmodified form as long as its last modification datestamp information is preserved.
