The known problems with Dan Bernstein's djbdns

You've come to this page as a result of a question similar to the following:

What are the known problems with Dan Bernstein's djbdns?

This is the Frequently Given Answer to that question.

As of version 1.05, djbdns has the various problems that are described in the sections below.

The patches fixing these problems, hyperlinked-to here, are all incorporated into the djbwares packages.

None of these problems are security problems. You won't be able to claim a reward by reporting them. djbdns may be doing the wrong things, but it isn't exposing a security vulnerability as a result.

Ironically, although there are a handful of vociferous people who like to propagate various myths about djbdns, in the several years that it has been published none of those people has spotted any of these actual problems with djbdns. (But then these problems are only discovered through experience of using the package, whereas several of the myths are so obviously wrong that they can only be believed by those who haven't even read the documentation, let alone actually obtained and used the software.)

The failure to build when using modern versions of GNU's C library

Stock djbdns fails to build (with an error at the link stage complaining about an undefined reference to the symbol errno) when using modern versions of GNU's C library. This is because it contains a programming error that was pointed out and described in detail back in July 2001. It assumes something that has never been true for Standard C, even at its inception in 1989: that errno is an ordinary variable that a program may declare for itself with "extern int errno;", rather than something that is made available only by including <errno.h>. Heretofore, djbdns has only built successfully because Unix and Linux C implementations happened to work the way that it needs them to. But now, they do not.

This programming error is prevalent throughout all of Dan Bernstein's softwares. The same bug occurs in qmail and in daemontools, for example.

Mate Weirdl published patches that correct this error for many of Dan Bernstein's softwares, including djbdns, on his FTP site.

Shantanu completed the job, and in 2003 published a set of patches for correcting this error in all of Dan Bernstein's softwares, on his web site. (Unfortunately, now, both <URL:http://tyskyshop.com./patches/patches.html> and <URL:http://www.shantanu.cjb.net./> no longer point to Shantanu's web site and Shantanu himself has disappeared.)

The lack of semantic error handling in tinydns-data

Stock tinydns-data doesn't handle semantic errors in its input:

This patch fixes all of these problems.

Truncation of alias chains by tinydns and axfrdns

According to the algorithm in RFC 1034 § 4.3.2, a content DNS server must search for a "CNAME" resource record for the domain name being queried, and if it finds one it must substitute the domain name from the data portion, add the resource record to the answer section of the response, and restart processing.

Other content DNS server softwares all do this. ISC's BIND does. Microsoft's DNS server does.

But tinydns and axfrdns don't do this.

Instead, tinydns and axfrdns add the resource record and finish. The result is that instead of publishing a complete answer, comprising the entire alias chain and then the resource record set for the eventual target domain name, they publish a partial answer, comprising just the first link in the alias chain and then a referral. tinydns and axfrdns think that they are publishing authority data about the "CNAME" resource record. But in fact, according to the rules in RFC 2308 § 2.1, what they are publishing is properly interpreted as an alias chain ending with a referral.
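The difference between the two behaviours can be sketched in a few lines of Python. This is only an illustrative model, not code from djbdns or from any other DNS server: the zone table, the function names, and the max_links guard are all hypothetical, and the sketch ignores wildcards, bailiwick checks, and the authority and additional sections.

```python
# Toy zone data: owner name -> {record type: [data, ...]}.
# All names and addresses here are hypothetical illustration data.
ZONE = {
    "www.example": {"CNAME": ["host.example"]},
    "host.example": {"A": ["192.0.2.1"]},
    "test.cname.example": {"CNAME": ["target.cname.example"]},
}


def answer_full_chain(qname, qtype, zone=ZONE, max_links=16):
    """Follow the alias chain to its end, restarting at each CNAME."""
    answer = []
    name = qname
    for _ in range(max_links):  # hypothetical guard against alias loops
        records = zone.get(name)
        if records is None:
            # The name at the end of the chain does not exist.
            return answer, "nxdomain"
        if qtype != "CNAME" and "CNAME" in records:
            target = records["CNAME"][0]
            answer.append((name, "CNAME", target))
            name = target  # substitute the domain name and restart
            continue
        for data in records.get(qtype, []):
            answer.append((name, qtype, data))
        return answer, "noerror"
    return answer, "servfail"


def answer_first_link_only(qname, qtype, zone=ZONE):
    """Add the first CNAME record and finish, without restarting."""
    records = zone.get(qname)
    if records is None:
        return [], "nxdomain"
    if qtype != "CNAME" and "CNAME" in records:
        return [(qname, "CNAME", records["CNAME"][0])], "noerror"
    return [(qname, qtype, d) for d in records.get(qtype, [])], "noerror"
```

In this model, answer_full_chain("www.example", "A") yields both the alias and the address record of its target, whereas answer_first_link_only("www.example", "A") yields the alias alone.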

Furthermore, and somewhat ironically, because tinydns and axfrdns think that they are publishing authority data, the delegation information in the referral is for the query domain itself or for some enclosing superdomain. In most cases, this is either a referral that is outside of the bailiwick for the content DNS server or a referral from that server back to itself. Resolving proxy DNS servers regard both kinds of referral as indications of a "lame server". Some resolving proxy DNS servers will mark "lame" content DNS servers as bad and try not to talk to them for a period. (dnscache does this. However, to additionally compound the irony, it contains a bug in its response handling whereby it doesn't properly detect lame referrals in the presence of client-side aliases, and so completely masks this problem. Other resolving proxy DNS server softwares correctly note tinydns+axfrdns servers as being "lame", however; albeit that BIND specifically recognises this particular error and logs it under its own heading as a "dangling CNAME pointer".)

Example 1:

Consider tinydns and axfrdns serving up the DNS database

.example:
Ctest.cname.example:target.cname.example
and responding to an "A" query for "test.cname.example".

The correct response is a type 2 response from RFC 2308 § 2.1, giving the alias for "test.cname.example" and with the "no such name" error code indicating, as RFC 2308 § 2.1 says, that "target.cname.example" does not exist:

1 test.cname.example:
107 bytes, 1+1+1+0 records, response, authoritative, nxdomain
query: 1 test.cname.example
answer: test.cname.example 86400 CNAME target.cname.example
authority: example 2560 SOA ns.example hostmaster.example 1057502685 16384 2048 1048576 2560

And, indeed, that is the response that other content DNS server softwares all give (when provided with equivalent DNS database content). With the patch applied, tinydns and axfrdns do so too.

Uncorrected, however, the response published by tinydns and axfrdns is:

1 test.cname.example:
74 bytes, 1+1+1+0 records, response, authoritative, noerror
query: 1 test.cname.example
answer: test.cname.example 86400 CNAME target.cname.example
authority: example 259200 NS ns.example

This response means something entirely different. It is a referral response from RFC 2308 § 2, giving the alias for "test.cname.example" as before and a referral (which is, of course, a lame self-referral) for "example.".
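A rough way to see why the two responses in Example 1 mean different things is to classify what each says about the name at the end of the alias chain. The following Python sketch is a deliberate simplification that covers only the distinctions drawn here, not the complete taxonomy in RFC 2308; the function name and the string labels are hypothetical.

```python
def classify_chain_end(rcode, authority_types):
    """Classify what a response says about the name at the end of the alias chain.

    rcode is "noerror" or "nxdomain"; authority_types is the set of record
    types present in the authority section.  The answer section is assumed
    to hold only the alias chain, with no records for the final name.
    """
    if rcode == "nxdomain":
        return "name error"   # the final name does not exist
    if "SOA" in authority_types:
        return "no data"      # the final name exists but has no such records
    if "NS" in authority_types:
        return "referral"     # ask the servers named in the authority section
    return "unknown"


# The corrected response in Example 1: nxdomain with an SOA record.
corrected = classify_chain_end("nxdomain", {"SOA"})
# The stock response in Example 1: noerror with only an NS record.
stock = classify_chain_end("noerror", {"NS"})
```

Under these assumptions, the corrected response is classified as a name error for "target.cname.example", and the stock response is classified as a referral.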

Example 2:

Run

dnsq a cname.ketil.froyn.name a.ns.ketil.froyn.name.
to see stock tinydns following its incorrect algorithm.

Run

dnsq a ia.imdb.com. dns1.imdb.com.
dnsq a visit.geocities.com. ns1.yahoo.com.
dnsq a www.likesystems.com. ns1.granitecanyon.com.
dnsq a www.idi.oclc.org. ns2.opentext.com.
to see several content DNS servers, running other content DNS server softwares, following the correct algorithm.

The reason that tinydns and axfrdns do what they do appears to stem from a false taxonomy of DNS responses. Whilst a resolving proxy DNS server is free to choose to ignore everything in a response apart from the first link in the chain of client-side aliases, that is not what the response is deemed to actually contain. The algorithm in RFC 1034 and the response taxonomy in RFC 2308 both agree that a response contains a chain of client-side aliases followed by the information (an error, a referral, or a resource record set) for the name at the end of the chain.

This patch fixes this problem.

Mis-handling of client-side aliases ("CNAME" records) in dnscache

Stock dnscache has lots of problems with client-side aliases:

This patch fixes all of these problems.

The outdatedness of the list of ICANN's "." content DNS servers in dnsroots.global

In 2002-11, 21 months after the release of version 1.05 of djbdns, ICANN changed the IP address of one of its "." content DNS servers from 198.41.0.10 to 192.58.128.30. As of 2003-05, and "for the foreseeable future" (to quote the ICANN announcement), the two content DNS servers are running in parallel, with the old one serving up the same data as the new one.

Later, on 2004-01-29, ICANN changed the IP address of another of its "." content DNS servers from 128.9.0.107 to 192.228.79.201. This time, an explicit cutoff was given. For 24 months from the change, the service will be provided in parallel on both addresses.

When, eventually, the old ICANN content DNS servers are taken down (and presuming that one has chosen to remain using the diminutive root and ICANN's "." content DNS servers), every time that dnscache has to start query resolution from the root of the namespace, it will query a non-existent server 2/13ths of the time. This will slow query resolution down slightly, as processing will have to time out before dnscache then tries asking another server.
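The size of the slowdown can be estimated with a little arithmetic. The Python sketch below makes an assumption that is not stated here: that dnscache tries the 13 listed addresses in a uniformly random order without repeating any of them. The function name is hypothetical.

```python
from fractions import Fraction

TOTAL_ADDRESSES = 13  # addresses in the list of "." content DNS servers
STALE_ADDRESSES = 2   # the two addresses that ICANN has changed

# Chance that the very first address tried is a stale one.
p_first_is_stale = Fraction(STALE_ADDRESSES, TOTAL_ADDRESSES)


def expected_stale_attempts(total, stale):
    """Expected number of timed-out attempts before a live server is reached.

    Assumes a uniformly random order of attempts with no address tried twice.
    """
    expected = Fraction(0)
    p_all_stale_so_far = Fraction(1)
    for k in range(stale):
        # Probability that attempts 1..k+1 have all hit stale addresses.
        p_all_stale_so_far *= Fraction(stale - k, total - k)
        expected += p_all_stale_so_far
    return expected


expected = expected_stale_attempts(TOTAL_ADDRESSES, STALE_ADDRESSES)
```

Under that assumption the first attempt times out 2/13ths of the time, and the expected number of timed-out attempts per resolution that starts from the root is 1/6.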

These instructions show how to update the lists of "." content DNS servers that are used by djbdns. They apply generally, whatever "." content DNS server organization one has chosen to use. They will update the list to be whatever the list of "." content DNS servers for one's chosen organization currently is.

This patch specifically and solely updates the list of ICANN's "." content DNS servers to reflect the change that ICANN made on 2002-11.

This patch specifically and solely updates the list of ICANN's "." content DNS servers to reflect the two changes that ICANN made on 2002-11 and 2004-01.

The DNS Client library used by the tools recognises a full stop character as the address separator in ${DNSCACHEIP}.

In stock djbdns, the DNS Client library allows multiple proxy DNS servers to be specified in the value of the ${DNSCACHEIP} environment variable, but uses the unfortunate syntax of separating those addresses with the full stop ('.') character. This has two problems:

This patch fixes this problem.

dnscache doesn't fully implement "forwardonly" mode, creating the possibility of other people's published delegation data causing a proxy loop.

When used as a resolving proxy DNS server, stock dnscache avoids proxy loops by using the RD bit. It sends its back-end queries with the RD bit set to 0, and only responds to front-end queries that it receives that have the RD bit set to 1.

dnscache can also be configured to (almost) be a forwarding proxy DNS server, by setting the ${FORWARDONLY} environment variable (to anything). In this mode, it sends its back-end queries with the RD bit set to 1. The RD bit is thus unusable as a mechanism for avoiding proxy loops in "forwardonly" mode.

With a proper forwarding proxy DNS server, eliminating proxy loops is purely a local administrative matter. The DNS administrator is responsible for simply ensuring that the server is not configured to forward back to itself (either directly or indirectly, via other forwarding proxy DNS servers).

However, in "forwardonly" mode stock dnscache is not a proper purely forwarding proxy DNS server. The manual page for dnscache says that in "forwardonly" mode it "forwards queries […], rather than contacting a chain of servers according to NS records", but this is not actually true in practice. Even in "forwardonly" mode, stock dnscache will follow referral responses if it receives them, and thus proceed to contact a chain of servers according to "NS" resource records.

This error creates the possibility of a proxy loop caused by data outside of the control of the local DNS administrator. The circumstances that trigger it are:

When dnscache resolves queries for that domain or for one of its subdomains, it queries the forwardee. Because of the forwarding misconfiguration, the forwardee responds with a referral containing the delegation data. Despite being in "forwardonly" mode, dnscache follows the referral to the next server(s) in the chain, according to the "NS" resource records. It sends a query to the delegated server(s).

Because of the duff delegation data, one or more of those servers happens to be dnscache itself, so it ends up sending the query to itself. Because the query has the RD bit set to 1 (as a result of dnscache being in "forwardonly" mode), dnscache accepts it. Because dnscache does not combine identical front-end queries from separate clients, it starts a second, parallel, query resolution for the same query. This, in turn, sends a query back to dnscache and generates a third query. And so forth.

Very quickly (within a few seconds), most reasonably-sized dnscache logs are completely flooded and all available query resolution slots are in use, processing the same query over and over again. dnscache becomes CPU bound, continuously processing queries that it has sent to itself (usually over a loopback network interface and thus involving no actual I/O at all) and generating new ones.

Each query resolution times out without receiving an answer, and so never sends an answer itself. The delegation is effectively lame, and the cycle never breaks.
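The role that the RD bit plays in this loop can be modelled in a few lines of Python. This is a toy model of the behaviour described above, not code from dnscache: the function names are hypothetical, and the slots parameter is a made-up stand-in for the limit on concurrent query resolutions.

```python
def accepts_front_end_query(rd_bit):
    """Model: only front-end queries with the RD bit set to 1 are answered."""
    return rd_bit == 1


def back_end_rd_bit(forwardonly):
    """Model: back-end queries carry RD=1 in "forwardonly" mode, RD=0 otherwise."""
    return 1 if forwardonly else 0


def simulate_self_referral(forwardonly, slots=200):
    """Count resolutions started when a referral points the server at itself.

    slots is a hypothetical limit on concurrent query resolutions.
    """
    started = 1  # the original client query
    while started < slots:
        rd = back_end_rd_bit(forwardonly)
        if not accepts_front_end_query(rd):
            break  # the query sent to itself is ignored, so the loop is broken
        started += 1  # accepted: a new, parallel resolution begins
    return started


resolving_mode = simulate_self_referral(forwardonly=False)
forwardonly_mode = simulate_self_referral(forwardonly=True)
```

In this model a resolving-mode server starts only the one original resolution, because the query that it sends to itself has the RD bit set to 0 and is ignored. A "forwardonly"-mode server keeps accepting its own queries until every slot is in use.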

It is relatively easy to replicate this problem by

Dan Bernstein flatly denies that this problem exists.

This patch fixes this problem.


© Copyright 2003–2004 Jonathan de Boyne Pollard. "Moral" rights asserted.
Permission is hereby granted to copy and to distribute this web page in its original, unmodified form as long as its last modification datestamp is preserved.