Mistakes to avoid when designing Unix dæmon programs

You've come here because you have either perpetrated one of the classic design mistakes in your Unix dæmon, or asked a question similar to the following:

What mistakes should I avoid making when designing how my Unix dæmon program operates?

This is the Frequently Given Answer to that question.

(You can find a different approach to this answer on one of Chris's Scribbles.)

There are many dæmon supervision and management tools that require dæmons to work as explained here, among them the daemontools family (including daemontools itself, runit, and the nosh toolset) and systemd.

Don't fork() in order to "put the dæmon into the background".

This whole idea is wrongheaded. The concepts of "foreground" and "background" don't apply to dæmons. They apply to user interfaces. They apply to processes that have controlling terminals. There is a "foreground process group" for controlling terminals, for example. They apply to processes that present graphical user interfaces. The window with the distinctive highlighting is conventionally considered to be "in the foreground", for example. Dæmons don't have controlling terminals and don't present textual/graphical user interfaces, and so the concepts of "foreground" and "background" simply don't apply at all.

When people talk about fork()ing "in order to put the dæmon in the background" they don't actually mean "background" at all. They mean that the dæmon is executed asynchronously by the shell, i.e. the shell does not wait for the dæmon process to terminate before proceeding to the next command. However, Unix shells are perfectly capable of arranging this. No code is required in your dæmon for doing so.

Let the invoker of your dæmon decide whether xe wants it to run synchronously or asynchronously. Don't assume that xe will only ever want your dæmon to run asynchronously. Indeed, on modern systems, administrators want dæmons to run synchronously, as standard behaviour.

Dæmon supervisors assume (quite reasonably) that if their child process exits then the dæmon has died and can be restarted. (Conversely, they quite reasonably assume that they can do things such as stopping the dæmon cleanly by sending their child process a SIGTERM signal.) The old BSD and System 5 init programs do this. So, too, do most proper dæmon supervision toolkits from the past 30 years. Forking to "put the dæmon into the background" entirely defeats such tools; and, ironically, it does so to no good end, because even without fork()ing, dæmons invoked by such supervisors are already "in the background": the supervisors themselves already run asynchronously from interactive shells, without controlling terminals, and without any interactive shell as their session leader.
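
For illustration, here is a minimal sketch, in C, of the shape that such a supervisor-friendly dæmon takes: it never fork()s to detach, it exits cleanly when sent SIGTERM, and the do_one_unit_of_work() function is a hypothetical stand-in for whatever the program actually does.

    /* A minimal sketch of a supervisor-friendly dæmon: it does not fork(),
     * does not detach, and exits cleanly when its supervisor sends SIGTERM.
     * do_one_unit_of_work() is a hypothetical placeholder. */
    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    static volatile sig_atomic_t stop_requested = 0;

    static void handle_term(int signo) {
        (void)signo;
        stop_requested = 1;
    }

    static void do_one_unit_of_work(void) {
        /* Placeholder: service one request, poll a queue, and so forth. */
        sleep(1);
    }

    int main(void) {
        struct sigaction sa;
        memset(&sa, 0, sizeof sa);
        sa.sa_handler = handle_term;
        sigaction(SIGTERM, &sa, NULL);

        fprintf(stderr, "daemon starting\n");    /* log to standard error */

        while (!stop_requested)
            do_one_unit_of_work();

        fprintf(stderr, "daemon stopping\n");
        return EXIT_SUCCESS;
    }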

If a novice system administrator who runs your dæmon from /etc/rc, /etc/rc.local, or the like (rather than from init or under the aegis of one of the many dæmon supervision toolkits) asks you how to persuade your dæmon to run asynchronously, so that the script does not wait for it to finish, point xem in the direction of the documentation for the shell's '&' metacharacter.

Don't assume that "foreground" means "debug mode".

Running a dæmon synchronously doesn't necessarily mean that vast quantities of debugging information are required. If (because you haven't bitten the bullet and eliminated the totally unnecessary fork()ing from your code) you have a command option that switches your dæmon between running "in the foreground" and running "in the background", do not make that command option do double duty. Running the dæmon without fork()ing should have no bearing upon whether or not debugging output from your program is enabled or disabled.

In general, don't conflate options that affect the actual operation of your program with options that affect the output of log or debug information.
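
As a sketch of what keeping those concerns separate can look like, here is a hypothetical command-line parser in which a -d option affects only logging verbosity, and in which there is no "foreground"/"background" option at all, because the program never fork()s to detach in the first place.

    /* A sketch, using getopt(), of keeping option semantics orthogonal:
     * the hypothetical -d flag only raises logging verbosity. */
    #include <stdio.h>
    #include <unistd.h>

    int main(int argc, char *argv[]) {
        int verbosity = 0;
        int opt;
        while ((opt = getopt(argc, argv, "d")) != -1) {
            switch (opt) {
            case 'd':
                ++verbosity;            /* affects logging output only */
                break;
            default:
                fprintf(stderr, "usage: %s [-d]\n", argv[0]);
                return 1;
            }
        }
        if (verbosity > 0)
            fprintf(stderr, "debug logging enabled\n");
        /* ... the dæmon's actual work goes here ... */
        return 0;
    }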

Don't use syslog().

syslog() is a very poor logging mechanism. Don't use it. Amongst its many faults and disadvantages are that it funnels every program's log output into one giant combined stream and that it is not composable with other logging mechanisms unless a specialized dæmon is listening on its protocol-specific sockets.

Write your log output to standard error, just like all other Unix programs do.

You'll find your dæmon easier to write, to boot. Code using fprintf(stderr,…) (or std::clog) is generally easier to maintain than code using syslog().
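
For example, a logging helper built on standard error can be as small as the following sketch; log_msg() is a hypothetical name, and timestamping, rotation, and storage are left to whatever reads the other end of the pipe.

    #include <stdarg.h>
    #include <stdio.h>

    /* A hypothetical helper: format a message and write it, with a trailing
     * newline, to standard error. */
    static void log_msg(const char *fmt, ...) {
        va_list ap;
        va_start(ap, fmt);
        vfprintf(stderr, fmt, ap);
        va_end(ap);
        fputc('\n', stderr);
    }

    int main(void) {
        log_msg("listening on port %d", 7);   /* instead of syslog(LOG_INFO, ...) */
        return 0;
    }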

In most dæmon supervision toolkits, there is a facility for the dæmon supervisor process to open a pipe, attach its write end to your dæmon's standard error, and attach the read end to the standard input of some other "log" dæmon. The dæmon supervisor in most toolkits also keeps the pipe open in its own process, so that if the "log" dæmon crashes and is auto-restarted (or is restarted by administrator command for some reason), unread log data at the time of the crash/restart remain safely in the pipe ready to be processed.
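
The plumbing involved is straightforward to picture. The following is a much-simplified sketch in C, with hypothetical ./log-daemon and ./main-daemon program names; real dæmon supervisors do a great deal more than this (restarting, status reporting, control interfaces).

    /* A simplified sketch of the mechanism described above: the supervisor
     * makes a pipe, runs the "log" dæmon with the read end as its standard
     * input, runs the "main" dæmon with the write end as its standard output
     * and standard error, and keeps both ends open in its own process so that
     * unread log data survive a "log" dæmon restart. */
    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        int fds[2];
        if (pipe(fds) < 0) { perror("pipe"); return 1; }

        if (fork() == 0) {                /* the "log" dæmon */
            close(fds[1]);
            dup2(fds[0], 0);              /* read end becomes standard input */
            execlp("./log-daemon", "log-daemon", (char *)NULL);   /* hypothetical */
            perror("execlp"); _exit(127);
        }

        if (fork() == 0) {                /* the "main" dæmon */
            close(fds[0]);
            dup2(fds[1], 1);              /* write end becomes standard output */
            dup2(fds[1], 2);              /* ... and standard error */
            execlp("./main-daemon", "main-daemon", (char *)NULL); /* hypothetical */
            perror("execlp"); _exit(127);
        }

        /* The supervisor deliberately keeps fds[0] and fds[1] open here, and
         * would wait on its children and restart them as needed (omitted). */
        for (;;) pause();
    }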

If a system administrator isn't using such a supervision toolkit, xe can always send your dæmons' standard errors through a pipe to splogger, logger, or sissylog. But the converse isn't true: syslog() isn't composable with other logging mechanisms unless one has a specialized dæmon listening on the protocol-specific sockets. An administrator can deal with standard error if xe isn't using a toolkit that already handles it, but dealing with syslog() in the same way is a lot harder.

systemd connects all dæmons' standard errors to the systemd "journal" dæmon through a pipe, too, although it retains other parts of the syslog design, such as combining multiple log streams into one single giant stream.

This is not directly relevant to how a dæmon operates per se, but there is a known problem with combining multiple streams into one: a flood of log messages from one highly verbose (or indeed malicious) source can cause log file rotation and the loss of potentially valuable log information from another important but relatively quiet source. This is why the syslog mechanism has fan-out after the fan-in. On most systems as configured out of the box, though, there isn't very much fan-out, and it remains relatively easy to wash important logs away in a flood.

By avoiding the fan-in in the first place, one avoids this problem more neatly. The various daemontools-family toolsets allow "main" dæmons to have individual "log" dæmons connected via individual pipes, thereby allowing completely disjoint streams of data, with individual disjoint log rotation and size policies per dæmon if desired, and (for maximum security) every "log" dæmon run under the aegis of its own individual user account, using the operating system's own account permission mechanisms to protect the "log" dæmon process and its log files/directories from interference by users, by other "log" dæmons, and even by the "main" dæmon whose output is being logged.

Don't deal with TCP/IP directly.

Let programs such as inetd, tcp-socket-listen and tcp-socket-accept from the nosh package, tcpserver (from Dan Bernstein's UCSPI-TCP), sslserver (from UCSPI-SSL), or tcpsvd (from Gerrit Pape's ipsvd) deal with the nitty gritty of opening, listening on, and accepting connections on sockets. All that your program needs to do is read from its standard input and write to its standard output. Then if someone comes along wanting to connect your program to some other form of stream, xe can do so easily.

Make your program into an application that is suitable for being spawned from a UCSPI server. If you really do need to have access to TCP-specific information, such as socket addresses, don't call getpeername() and so forth directly. Parse the TCP/IP local and remote information that should be provided by the UCSPI server in the TCP environment variables. (For one thing, a system administrator will find it a lot easier to test an access-control mechanism that is based upon $TCPREMOTEIP than to test one that is based upon getpeername().)
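
Here is a sketch of what such a program can look like: it reads from standard input, writes to standard output, and consults $TCPREMOTEIP instead of calling getpeername(). The trivial echo "protocol" and the access-control rule are placeholders.

    /* A sketch of a UCSPI-style service program. The UCSPI server (tcpserver,
     * tcpsvd, and the like) handles the sockets and sets the TCP environment
     * variables; this program only uses standard input and standard output. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void) {
        const char *remote = getenv("TCPREMOTEIP");
        char line[1024];

        fprintf(stderr, "connection from %s\n", remote ? remote : "unknown");

        if (remote && strcmp(remote, "127.0.0.1") != 0) {
            /* A trivial, easily testable access-control rule (placeholder). */
            fputs("access denied\r\n", stdout);
            return 0;
        }
        while (fgets(line, sizeof line, stdin)) {
            fputs(line, stdout);          /* placeholder protocol: echo */
            fflush(stdout);
        }
        return 0;
    }

Such a program can be run under a UCSPI server (for example, tcpserver 0 1234 ./this-program) or tested by hand from a shell simply by setting TCPREMOTEIP in the environment before invoking it.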

This design allows you to follow the Principle of Least Privilege more closely, too. If your program were to handle the TCP sockets itself and the number of the port that it used were in the reserved range (1023 and below), it would need to be run as the superuser. All of the code of your program would need to be audited to ensure that it had no loopholes through which one could gain superuser access. If, however, your program relied on tcpserver, sslserver, or tcpsvd to perform all of the socket control, it could be invoked under a non-superuser UID and GID via setuidgid. Loopholes in your program would only allow an attacker to do things that that UID and GID could do. If, for example, that UID and GID owned no files or directories on the filesystem, had write access to none, and no other processes ran under the same UID, then an attacker who compromised your program could do very little apart from disrupting the operation of your program itself.

If your program is one of the (exceedingly) rare cases where you do need to create, listen on, and accept connections on sockets yourself, allow the system administrator full control over the IP addresses (and port numbers) that your program will use.
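
In that rare case, a sketch of the administrator-friendly approach is to take the listening address and port from the command line (or from a configuration mechanism of the administrator's choosing) rather than hard-coding them, for instance via getaddrinfo():

    /* A sketch for the rare self-listening case: the administrator supplies
     * the address and port; nothing is hard-coded into the program. */
    #include <netdb.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/types.h>

    int main(int argc, char *argv[]) {
        if (argc != 3) {
            fprintf(stderr, "usage: %s address port\n", argv[0]);
            return 1;
        }
        struct addrinfo hints, *res;
        memset(&hints, 0, sizeof hints);
        hints.ai_family = AF_UNSPEC;      /* IPv4 or IPv6, as the admin chooses */
        hints.ai_socktype = SOCK_STREAM;
        hints.ai_flags = AI_PASSIVE;

        int rc = getaddrinfo(argv[1], argv[2], &hints, &res);
        if (rc != 0) {
            fprintf(stderr, "%s: %s\n", argv[1], gai_strerror(rc));
            return 1;
        }
        int fd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
        if (fd < 0 || bind(fd, res->ai_addr, res->ai_addrlen) < 0 || listen(fd, 64) < 0) {
            perror("listen socket");
            return 1;
        }
        freeaddrinfo(res);
        /* ... accept() loop goes here ... */
        return 0;
    }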

Don't create PID files in /run (or anywhere else).

Creating PID files in /run (a.k.a. /var/run) has all sorts of flaws and disadvantages, among them that the file goes stale if the dæmon crashes without removing it, that the recorded process ID can be recycled by the operating system and come to denote some entirely unrelated process, and that there is an unavoidable race between reading the file and signalling the process named in it.

Let the system administrator use whatever dæmon supervisor is invoking your dæmon to handle killing the correct process. Dæmon supervisors don't need PID files. They know what process IDs to use because they remember them from when they fork()ed the dæmon processes in the first place. init doesn't need a PID file in /var/run to tell it which of its children to kill when the run level changes, for example.
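
A sketch of why this is so: a supervisor that fork()s and exec()s the dæmon already holds the only piece of information that a PID file would record, and can signal exactly the right process without one. The ./the-daemon pathname below is a hypothetical example, and a real supervisor would of course wait for an administrator command rather than a fixed interval.

    /* A sketch of a supervisor that needs no PID file: it remembers the
     * process ID returned by fork() and signals exactly that child later. */
    #include <signal.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        pid_t child = fork();
        if (child == 0) {
            execlp("./the-daemon", "the-daemon", (char *)NULL);   /* hypothetical */
            _exit(127);               /* exec failed */
        }

        sleep(60);                    /* stand-in for "until told to stop" */

        kill(child, SIGTERM);         /* no PID file ever consulted */
        waitpid(child, NULL, 0);
        return 0;
    }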

daemontools has been described as "/var/run done the right way". The other toolkits in the daemontools family use the same approach. With all of them, there is no need for PID files. Dæmons are controlled with the svc (or runsvctrl) commands, which are what shutdown scripts should use instead of kill, pkill, and the like.


© Copyright 2001–2004,2007,2014 Jonathan de Boyne Pollard. "Moral" rights asserted.
Permission is hereby granted to copy and to distribute this web page in its original, unmodified form as long as its last modification datestamp is preserved.