Introduction to redo

redo is a package originally designed by Daniel J. Bernstein in 2003. Its purpose is to provide a build system for software packages that does incremental builds, i.e. if the package is built and then some of its source files are changed, the build system will only re-run that part of the build procedure that is necessary to re-build the changed parts of the package.

Although Bernstein apparently did write the tools, as traces of them can be found in the build systems of some of the packages that he did release, he never got around to properly packaging up and releasing them. Two people, Alan Grosskurth and Avery Pennarun, picked up the ball and ran with it, however.

If you want to see some of the aforementioned traces, look at Bernstein's ptyget package, published in 1996. It has .do files that run (undocumented and unpublished) programs named dependon, dependcc, formake, and directtarget. Evidently this is not the redo system as envisaged seven years later, given formake, which seems to be a tool for generating Makefiles a snippet at a time; but dependon is a clear precursor to redo-ifchange.

I have a consolidation of Bernstein packages where not only have I replaced the .do files for ptyget with ones that invoke the redo tools as we know them, but I have also made all of the rest of the packages build with redo too.

The Basics

Both the Grosskurth and Pennarun implementations share a set of fundamental concepts, from Bernstein's design:

Maxims

Several maxims have driven the design and the implementations of redo:

An example system

Since the redo command is unconditional, and will always result in a "do" file being run, targets at the very top level of a build system are effectively pseudo-targets, and can be given the common pseudo-target names such as all:

# all.do
redo-ifchange example.exe

Note that there's little use for redo-ifcreate in a pseudo-target.

By making the commands for compiling and linking into scripts, importing the compiler options from short text files, one can ensure that whenever compiler options or the compiler command itself change, redo will flag all appropriate targets as out of date and re-build them with the new compiler/compiler options:

# example.exe.do
objects="example.o wibble.o"
redo-ifchange ./link ${objects}
./link $3 ${objects}

#!/bin/sh -e
# link
redo-ifchange ./cxx ./cxxflags ./ldflags
read -r cxx < ./cxx
read -r cxxflags < ./cxxflags
read -r ldflags < ./ldflags
${cxx} ${cxxflags} ${ldflags} -o "$@"

There are several possible variants of the above, of course. The link script could be part of the "do" script. The command options could be embedded directly in the "do" script as well. The aforegiven arrangement allows for fan-in: multiple target executables can share a single link script, and multiple link scripts can share a single cxx file (whilst, say, having different ldflags files listing different sets of link libraries for differing groups of linked executables).
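
The read -r idiom in the link script is worth a closer look, since it is what lets a one-line text file stand in for a command name or an option list. As a small sketch, with file contents invented purely for illustration: read -r takes the first line of the file verbatim, and the later unquoted expansion lets the shell split that line into separate arguments.

```shell
# Sketch: how a short text file becomes a list of compiler options.
# The command name and option values here are hypothetical.
work=$(mktemp -d)
printf '%s\n' 'c++' > "$work/cxx"
printf '%s\n' '-O2 -Wall -std=c++11' > "$work/cxxflags"

# Read the first line of each file, without backslash processing.
read -r cxx < "$work/cxx"
read -r cxxflags < "$work/cxxflags"

# Count the words that the unquoted expansion produces.
count_args() { echo "$#"; }
nflags=$(count_args ${cxxflags})

echo "command: ${cxx}"
echo "options: ${nflags}"
rm -rf "$work"
```

Two consequences of this arrangement seem worth bearing in mind. An option that itself contains whitespace cannot be expressed this way, because the expansion is deliberately unquoted. And since read returns a failure status when it meets end of file before a newline, a script run with sh -e will stop if one of these files lacks its terminating newline.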

Although, as previously mentioned, C/C++ compilers don't generate proper dependency information for files that they have relied upon the non-existence of, generation of the half of the dependency information relating to existing files is fairly trivial:

# default.o.do
redo-ifchange ./compile $1.cpp ./convert-depend
./compile $3 $1.cpp
./convert-depend $3 $1.d | xargs redo-ifchange

#!/bin/sh -e
# compile
redo-ifchange ./cxx ./cxxflags ./cppflags
read -r cxx < ./cxx
read -r cxxflags < ./cxxflags
read -r cppflags < ./cppflags
${cxx} ${cxxflags} ${cppflags} -MMD -c "$2" -o "$1"

#!/bin/sh -e
# convert-depend
# Strip the "target:" prefix and the line-continuation backslashes.
exec sed -e "s/^$1://" -e 's/\\//g' "$2"
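
To make the convert-depend step concrete, here is a minimal sketch of what it does to a dependency file in the Makefile-fragment format that -MMD emits. The file contents and the names example.o, example.d, wibble.h, and utils.h are made up for illustration. The backslash-stripping expression is written in single quotes in this sketch so that sed receives both backslashes intact.

```shell
# Sketch: turn a Makefile-style dependency file into a plain list of files.
# All file names here are hypothetical.

convert_depend() {
    # $1 = target name as written in the .d file, $2 = the .d file
    sed -e "s/^$1://" -e 's/\\//g' "$2"
}

tmp=$(mktemp -d)
cat > "$tmp/example.d" <<'EOF'
example.o: example.cpp wibble.h \
 utils.h
EOF

# Put one name on each line and drop the empty line left by leading spaces.
deps=$(convert_depend example.o "$tmp/example.d" | tr -s ' ' '\n' | sed '/^$/d')
printf '%s\n' "$deps"
rm -rf "$tmp"
```

In the "do" script proper the final tidying is unnecessary, as xargs splits its input on any whitespace before handing the names to redo-ifchange. Note that the target name is used as a regular expression here, so a name containing characters that are special to sed would need escaping; the sketch does not attempt that.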

Observations

C/C++ compiler deficiencies that inhibit proper use of redo

As noted earlier, no C or C++ compiler currently generates any redo-ifcreate dependency information, only the redo-ifchange dependency information. This is a deficiency of the compilers rather than a deficiency of redo, though. That the introduction of a new higher-precedence header earlier on the include path will affect recompilation is a fact that almost all C/C++ build systems fail to account for.

I have written, but not yet released, a C++ tool that is capable of generating both redo-ifchange information for included files and redo-ifcreate information for the places where included files were searched for but didn't exist, and thus where adding new (different) included files would change the output:

/package/admin/nosh % /package/prog/cc/command/cpp --MMD build/system-control.cpp >/dev/null 2>&1
/package/admin/nosh % cat build/system-control.cpp.d
redo-ifcreate utils.h fdutils.h service-manager-client.h runtime-dir.h home-dir.h popt.h
redo-ifchange build/utils.h build/fdutils.h build/service-manager-client.h build/runtime-dir.h build/home-dir.h build/popt.h
/package/admin/nosh %
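
The pairing of redo-ifcreate and redo-ifchange lines in that output follows from how an include search works: every directory that is searched before the header is found is a place where a newly created file would change the result. The following is a minimal shell sketch of that idea only, not the unreleased tool itself; the function name header_deps and the directory layout are hypothetical.

```shell
# header_deps HEADER DIR...
# Walk the include directories in order. Each directory where HEADER is
# absent yields a redo-ifcreate line; the first directory where it exists
# yields a redo-ifchange line and ends the search.
header_deps() {
    header=$1
    shift
    for dir in "$@"
    do
        if [ -e "$dir/$header" ]
        then
            printf 'redo-ifchange %s\n' "$dir/$header"
            return 0
        fi
        printf 'redo-ifcreate %s\n' "$dir/$header"
    done
    return 1    # not found anywhere on the search path
}

# Demonstration with a throwaway directory tree (names are made up).
demo=$(mktemp -d)
mkdir "$demo/local" "$demo/build"
: > "$demo/build/utils.h"
found=$(cd "$demo" && header_deps utils.h local build)
printf '%s\n' "$found"
rm -rf "$demo"
```

A real preprocessor would have to record this for every #include that it processes, including nested ones, which is why the information has to come from the compiler rather than from a wrapper script.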

Incremental build requirements that redo handles well

redo handles four common incremental changes to an already-built system easily:

Structural deficiencies in the basic design of redo

Like make, redo assumes a 1:M relationship between targets and dependencies. And like make, it is quite difficult to deal with a project where one build instruction generates multiple targets all together in one go. (An example of this is a parser generator, which generates multiple .cpp and header files from a single grammar file.)

In addition to this problem of the source:target ratio, redo further assumes a fairly basic arrangement of the "do" files and target files with respect to the source file tree. That arrangement makes for good examples, but has difficulties with the more advanced structures that occur in real-world development:

What redo specifically lacks is the concept of three separate directory trees for source files, "do" files, and target files, allowing multiplatform and multicompiler development, read-only source with honest dependencies and debugging information that points to the right filenames, and side-by-side debug/release builds.

Problems with the implementations of redo

Avery Pennarun's redo implementation forks processes excessively and duplicates the kernel's, and indeed the shell's, #! processing.

This is an unfortunate byproduct of its assuming that "do" files are Bourne shell scripts and only requiring #! when they are not. It passes all "do" scripts directly to the Bourne shell, and thus has to check for #! and do its own #! processing duplicating, not necessarily exactly or correctly, what the kernel does. In particular, the #! implementation in Pennarun redo as of 2012 has several of the well-known and widely documented #! security holes that have long since been closed in actual operating system kernel implementations of the same mechanism, including the script filename race condition. It also has an incorrect magic number check and an additional PATH security hole caused by how it attempts to run the Bourne shell.

A better approach would have been to use the Unix philosophy as it stood: a program to be run is just run, with execve(), as-is; "do" scripts are expected to all have #! to identify their particular interpreters; the system relies on the operating system kernel's processing of #! and doesn't roll its own one badly; and there is no hardwiring of one particular shell.
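
As a rough sketch of that alternative, assuming that every "do" file is executable and carries its own #! line, the build tool's job reduces to executing the file and letting the kernel choose the interpreter. The function name run_do, the sample script, and the three arguments passed to it are hypothetical. Note also that a shell, unlike a bare execve(), may fall back to interpreting a file that has no #! line itself, so this only approximates the behaviour described.

```shell
# run_do DOFILE ARGS...
# Run a "do" file as-is, without inspecting its contents or naming a shell.
run_do() {
    dofile=$1
    shift
    # Ensure the name contains a slash so that PATH is never searched.
    case "$dofile" in
        */*) ;;
        *) dofile=./$dofile ;;
    esac
    if [ ! -x "$dofile" ]
    then
        echo "run_do: $dofile is not executable" >&2
        return 126
    fi
    "$dofile" "$@"
}

# Demonstration: a throwaway "do" file whose interpreter is chosen by #!.
sandbox=$(mktemp -d)
cat > "$sandbox/hello.do" <<'EOF'
#!/bin/sh
echo "target=$1 output=$3"
EOF
chmod +x "$sandbox/hello.do"
result=$(run_do "$sandbox/hello.do" hello "" hello.tmp)
printf '%s\n' "$result"
rm -rf "$sandbox"
```

Nothing in run_do reads the first line of the file or mentions a particular shell, so a "do" file written for any other interpreter would be run in exactly the same way.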


© Copyright 2012 Jonathan de Boyne Pollard. "Moral" rights asserted.
Permission is hereby granted to copy and to distribute this web page in its original, unmodified form as long as its last modification datestamp information is preserved.