The gen on the UNIX Client-Server Program Interface

The UNIX Client-Server Program Interface is a mechanism for passing information about socket clients and servers to otherwise socket-agnostic programs. It permits otherwise fairly ordinary stream-processing tools to be used with sockets, without their needing special code.

The original formulation of the basic UCSPI was Daniel J. Bernstein's UNIX Client-Server Program Interface published in 1996. This was a transport-neutral protocol. On top of it, he defined specifics for the TCP transport protocol, a TCP UCSPI protocol definition, also published in 1996. He published a set of tools for handling SOCK_STREAM sockets in the AF_INET family that adhered to this protocol.

This was fairly obviously extensible to other protocols, and other people fairly promptly extended it in the following few years. In 2000, Bruce Guenter came out with a set of tools for handling SOCK_STREAM sockets in the AF_LOCAL (a.k.a. AF_UNIX, hence the name) address family, and an accompanying UNIX UCSPI protocol definition. Only a day later and independently, William Baxter came out with a set of tools for handling the same thing and an accompanying IPC UCSPI protocol definition. M. Baxter also came out with a set of tools for handling SSL/TCP sockets and an accompanying SSL UCSPI protocol definition. This became a small sub-family of tools. Scott Gifford came out with a set of tools for handling TLS/TCP sockets and an accompanying TLS UCSPI protocol definition.

Both M. Baxter and M. Bernstein requested that people register sub-protocols. But in fact UCSPI-IPC versus UCSPI-UNIX is the only conflict that there has ever been in almost two decades, and there aren't that many real-world widely-used transport mechanisms that don't have a sub-protocol definition at this point.

A precursor to UCSPI was the Service Access Facility in AT&T Unix System 5, which was oriented towards AT&T STREAMS rather than BSD sockets but had similar ideas.

Environment variables

The primary environment variable shared by all of the sub-protocols is the PROTO environment variable. Its value declares which sub-protocol is in effect, and is also the prefix of the names of all of the variables used by that sub-protocol.

If it has the value TCP then the protocol is UCSPI-TCP, and the remaining variables are:

TCPLOCALIP
the IP address of the local host, in standard human-readable form
TCPLOCALPORT
the local TCP port number, in decimal
TCPLOCALHOST
a name listed in the DNS for the local host, unset if no such name is available/obtained
TCPREMOTEIP
the IP address of the remote host, in standard human-readable form
TCPREMOTEPORT
the remote TCP port number, in decimal
TCPREMOTEHOST
a name listed in the DNS for the remote host, unset if no such name is available/obtained
TCPREMOTEINFO
a string supplied by the remote host for the connection at hand via the 931/1413/IDENT/TAP protocol, unset if none is available/obtained

If it has the value TCP6 then the protocol is the (old) IPv6 enhanced UCSPI-TCP, and the remaining variables are:

TCP6LOCALIP
the IP address of the local host, in standard human-readable form
TCP6LOCALPORT
the local TCP port number, in decimal
TCP6LOCALHOST
a name listed in the DNS for the local host, unset if no such name is available/obtained
TCP6REMOTEIP
the IP address of the remote host, in standard human-readable form
TCP6REMOTEPORT
the remote TCP port number, in decimal
TCP6REMOTEHOST
a name listed in the DNS for the remote host, unset if no such name is available/obtained
TCP6REMOTEINFO
a string supplied by the remote host for the connection at hand via the 931/1413/IDENT/TAP protocol, unset if none is available/obtained

If it has the value UNIX then the protocol is M. Guenter's UCSPI-UNIX, and the remaining variables are:

UNIXLOCALPATH
the path associated with the local-domain socket
UNIXLOCALUID
the UID of the local process
UNIXLOCALGID
the GID of the local process
UNIXLOCALPID
the PID of the local process
UNIXREMOTEEUID
the effective UID of the remote process, if the socket supports obtaining it
UNIXREMOTEEGID
the effective GID of the remote process, if the socket supports obtaining it
UNIXREMOTEPID
the PID of the remote process, if the socket supports obtaining it

If it has the value IPC then the protocol is M. Baxter's UCSPI-IPC, and the remaining variables are:

IPCLOCALPATH
the file name associated with the local socket
IPCREMOTEPATH
the path associated with the remote socket
IPCREMOTEEUID
the effective UID of the remote process that called connect, if the socket supports obtaining it
IPCREMOTEEGID
the effective GID of the remote process that called connect, if the socket supports obtaining it

If it has the value SSL then the protocol is M. Baxter's UCSPI-SSL, and the remaining variables are:

SSLLOCALIP
the IP address of the local host, in standard human-readable form
SSLLOCALPORT
the local SSL port number, in decimal
SSLLOCALHOST
a name listed in the DNS for the local host, unset if no such name is available/obtained
SSLREMOTEIP
the IP address of the remote host, in standard human-readable form
SSLREMOTEPORT
the remote SSL port number, in decimal
SSLREMOTEHOST
a name listed in the DNS for the remote host, unset if no such name is available/obtained
SSLREMOTEINFO
a string supplied by the remote host for the connection at hand via the 931/1413/IDENT/TAP protocol, unset if none is available/obtained
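
As a minimal sketch of how a program spawned by a UCSPI tool might read these variables, the following looks up PROTO and then collects whichever of that sub-protocol's variables are set. The function name ucspi_connection_info and the SUFFIXES table are illustrative inventions, not part of any UCSPI toolset; the table simply transcribes the variable lists above. Looking names up from a fixed table, rather than matching on the prefix alone, avoids mistaking a stray TCP6REMOTEIP for a TCP variable.

```python
import os

_IP_SUFFIXES = ("LOCALIP", "LOCALPORT", "LOCALHOST",
                "REMOTEIP", "REMOTEPORT", "REMOTEHOST", "REMOTEINFO")

# Variable-name suffixes per sub-protocol, transcribed from the lists above.
SUFFIXES = {
    "TCP": _IP_SUFFIXES,
    "TCP6": _IP_SUFFIXES,
    "SSL": _IP_SUFFIXES,
    "UNIX": ("LOCALPATH", "LOCALUID", "LOCALGID", "LOCALPID",
             "REMOTEEUID", "REMOTEEGID", "REMOTEPID"),
    "IPC": ("LOCALPATH", "REMOTEPATH", "REMOTEEUID", "REMOTEEGID"),
}


def ucspi_connection_info(environ=None):
    """Return (proto, info), where info maps e.g. "REMOTEIP" to its value.

    Variables that are unset are simply absent from info.
    """
    if environ is None:
        environ = os.environ
    proto = environ.get("PROTO")
    if proto is None:
        raise LookupError("PROTO is not set")
    if proto not in SUFFIXES:
        raise LookupError("unrecognized sub-protocol: " + proto)
    info = {}
    for suffix in SUFFIXES[proto]:
        value = environ.get(proto + suffix)
        if value is not None:
            info[suffix] = value
    return proto, info
```

A server program would normally call this with no argument, so that it reads its own inherited environment.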

Softwares

It's a bit misleading to talk of software support for UCSPI, because any UNIX program that can speak simple stream-based I/O over inherited file descriptors is theoretically compatible. That is, after all, the point of the design. Softwares that allow one to construct UCSPI clients and servers, built with such otherwise ordinary UNIX tools and doing the parts that the ordinary tools don't need to handle (or even know about), include:

The nosh toolset has similar tools for handling SOCK_DGRAM sockets in the AF_INET/AF_INET6 address families and UNIX FIFOs. But there are no defined UCSPI-UDP and UCSPI-FIFO sub-protocols. They make less sense in connectionless datagram-based I/O; and both UDP socket and FIFO-based I/O lack the listen-accept server model, where UCSPI sub-protocols usually fit in at the accept stage.

Only the nosh toolset and s6-networking separate the listen-accept model into separate programs, providing the capability of spawning what systemd jargon would call Accept=no (and what inetd jargon would call wait) servers that take the overall listening socket as their input rather than the socket for a single accepted connection.

The IPv6 mess

Bernstein's original protocol specifications were published a couple of years before IPv6 formally existed, and are in fact contemporary with things like RFC 1924. They don't cover IPv6. In the intervening years there have been two approaches to extending the UCSPI-TCP protocol from TCP/IPv4 to TCP/IPv6.

The approach taken by M. von Leitner and M. Hoffman, only a short while after M. Bernstein had written The IPv6 Mess, was to invent a wholly new sub-protocol for IPv6. The on-the-wire protocols were distinct; the UCSPI sub-protocols were likewise distinct. M. Bernstein's talk of being "ready and willing to make various changes to the code" never transformed into action, and the original ucspi-tcp package available from him remains IPv4-only to this day. Packages such as the FreeBSD sysutils/ucspi-tcp package use M. von Leitner's patches for IPv6 support. This adds tool command-line options for forcing IPv4 information to be passed using the TCP6 sub-protocol as IPv4-mapped IPv6 addresses.

The approach taken by the nosh package, by onetd, by GNU inetd, and by s6-networking is based upon the principle that programs and programming library functions are nowadays as cognizant of human-readable IPv6 addresses as they were of IPv4 addresses in the 1990s. Therefore server programs can and do cope if the TCPLOCALIP and TCPREMOTEIP variables contain either IPv4 addresses in dotted-decimal human-readable form or IPv6 addresses in hexadecimal-with-colons human-readable form. Therefore two sets of environment variables with different names are unnecessary. In one sense this is the reverse of the von Leitner/Hoffman approach. Whilst the von Leitner/Hoffman approach can end up with both IPv4 and IPv6 information passed using the one TCP6 sub-protocol, this approach ends up with both IPv4 and IPv6 information passed using the one TCP sub-protocol.
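
To illustrate the point about library functions, here is a small sketch of a server program that accepts either address form in TCPREMOTEIP. It relies on Python's standard ipaddress module, which parses both dotted-decimal IPv4 and hexadecimal-with-colons IPv6 text. The function name remote_endpoint is hypothetical, and bracketing IPv6 addresses in the result is just one common display convention, not something that UCSPI specifies.

```python
import ipaddress


def remote_endpoint(environ):
    """Format the remote address and port from UCSPI-TCP variables.

    TCPREMOTEIP may hold an IPv4 or an IPv6 address; ip_address() accepts
    either and raises ValueError for anything else.
    """
    addr = ipaddress.ip_address(environ["TCPREMOTEIP"])
    port = int(environ["TCPREMOTEPORT"])
    if addr.version == 6:
        return "[{}]:{}".format(addr, port)
    return "{}:{}".format(addr, port)
```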

It is worth noting that there are in practice few examples in the wild of servers that speak the von Leitner/Hoffman TCP6 sub-protocol. It seems to be the case that the nosh/onetd/GNU inetutils/s6-networking approach is what gets used in practice.

Security

Unreliable sources of information and variable leakage

Most of the sub-protocol specifications warn of the dangers of the various …HOST and …INFO environment variables. These contain attacker-supplied and attacker-controlled information, which at the very least can contain unexpected non-alphanumeric characters that script writers should be careful about. A common design in most toolsets is to provide a database of service access control rules against which the environment variables are looked up. Common advice for this is to not base rules upon the …HOST and …INFO environment variables.
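
One hedged illustration of the care that script writers need to take: before writing a …HOST or …INFO value to a log or passing it anywhere near a shell, replace every character outside a small allow-list. The function name sanitize and the particular allow-list (ASCII letters, digits, dot, and hyphen) are assumptions made for this sketch; none of the sub-protocol specifications prescribes a sanitizing rule.

```python
import string

# Characters allowed through unchanged; an assumption for this sketch.
_SAFE = frozenset(string.ascii_letters + string.digits + ".-")


def sanitize(value, replacement="_"):
    """Replace every character of value that is not in the allow-list."""
    return "".join(ch if ch in _SAFE else replacement for ch in value)
```

This makes the value safe to display; it does not make it trustworthy, so the advice against basing access rules upon it still applies.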

The server programs in the non-inetd toolsets also all provide command-line options that will stop the servers from attempting to even look the information up, which would incur communication with potentially attacker-controlled IDENT and content DNS servers. Sadly, only a few toolsets default to not doing such lookups; the rest following M. Bernstein's unfortunate original lead of making the secure approach a non-default case.

Notice that their behaviour is not to just leave existing environment variables by those names alone when they are configured not to do lookups, but to explicitly unset any existing variables containing indeterminate data that they might have been spawned with. An explicit security guarantee of UCSPI, given in the original protocol specification, is that all of the variable contents have originated properly and form a coherent set. It should not be possible for unrelated UCSPI environment variables to "leak" into a server or client program's environment from some previous program or grandparent process.
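
The unset-then-set behaviour described above can be sketched as follows for the TCP sub-protocol. The function build_child_environment and the TCP_SUFFIXES tuple are hypothetical names for this illustration and are not taken from any of the toolsets discussed. The sketch covers only the TCP variables; a fuller tool would decide for itself how to treat variables belonging to other sub-protocols.

```python
TCP_SUFFIXES = ("LOCALIP", "LOCALPORT", "LOCALHOST",
                "REMOTEIP", "REMOTEPORT", "REMOTEHOST", "REMOTEINFO")


def build_child_environment(parent_env, values):
    """Return a copy of parent_env carrying a coherent set of TCP variables.

    values maps suffixes such as "REMOTEIP" to strings. Any TCP variable
    that is absent from values (or is None) is removed from the copy, so
    stale data inherited from a parent process cannot leak through.
    """
    env = dict(parent_env)
    env["PROTO"] = "TCP"
    for suffix in TCP_SUFFIXES:
        value = values.get(suffix)
        if value is None:
            env.pop("TCP" + suffix, None)   # explicitly unset
        else:
            env["TCP" + suffix] = value
    return env
```

The resulting dictionary could then be handed to os.execve or subprocess.run as the server program's environment.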

M. Bercot's s6-networking tools do not adhere to this part of the UCSPI specification. Neither does M. Sampson's onetd. This is because whilst they make no DNS nor IDENT lookups, they do not remove any pre-existing …HOST and …INFO environment variables. Beware that the same is also true of inetd programs other than the inetd from GNU inetutils; not all inetds are the same by a long chalk. Sergey Poznyakoff's UCSPI patch to GNU inetd provides the correct security guarantee, unsetting all variables before then setting the ones that it does. But, in stark contrast, the support in the uClinux inetd from Alex Belits is befuddled (half-including some only half-implemented "UCSPI-UDP" support in a code path that UDP never even reaches) and like s6-networking and onetd lacks the guarantee of unsetting all not explicitly set environment variables.

Security by design

Part of the fundamental design of UCSPI is to use UNIX operating system facilities to secure servers against exploits.

More IPv6 mess

Whilst people such as Jun-ichiro itojun Hagino railed against IPv4-mapped IPv6 addresses as a security problem in the 2000s, von Leitner's/Hoffman's toolsets with the IPv4-mapped IPv6 options switched on actually prevent many of the problems railed against. An administrator cannot configure a straight IPv4 rule and forget to configure a corresponding IPv4-mapped IPv6 rule (thereby creating a possible weakness). The IPv6 rules are the only rules, and the only way to write a rule for IPv4 is one that covers both straight IPv4 and IPv4-mapped IPv6 addresses at once. This is not the case with the nosh/onetd/GNU inetutils/s6-networking approach.

In fairness, this is not something that is specific to UCSPI, as it relates to any toolset that has access control rules expressed in terms of IPv4 or IPv6 addresses.
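
The idea of having only IPv6 rules can be sketched with Python's standard ipaddress module. Every address is first normalized to IPv6, with straight IPv4 addresses converted to their IPv4-mapped form (::ffff:a.b.c.d), and only then compared against rules that are written as IPv6 networks. The function names to_ipv6 and is_allowed are hypothetical, and this is not the rule syntax or matching code of any of the toolsets mentioned.

```python
import ipaddress


def to_ipv6(text):
    """Parse text as an IP address, mapping IPv4 into ::ffff:0:0/96."""
    addr = ipaddress.ip_address(text)
    if addr.version == 4:
        return ipaddress.IPv6Address((0xFFFF << 32) | int(addr))
    return addr


def is_allowed(text, rules):
    """Return True if the address falls within any IPv6Network in rules."""
    addr = to_ipv6(text)
    return any(addr in network for network in rules)
```

With a single rule such as ipaddress.IPv6Network("::ffff:192.0.2.0/120"), both "192.0.2.7" and "::ffff:192.0.2.7" match, so there is no second rule to forget.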


© Copyright 2015 Jonathan de Boyne Pollard. "Moral" rights asserted.
Permission is hereby granted to copy and to distribute this web page in its original, unmodified form as long as its last modification datestamp is preserved.