Whenever getaddrinfo needed network interface information it used the
netlink interface to read the information every single time. The
problem is that this information can change at any time.
The patch implements monitoring of the network interfaces through
nscd. If no change is detected the previously read information can
be reused (which is the norm). This timestamp information is also
made available to other processes using the shared memory segment
between nscd and those processes.
nscd can clear caches when certain files change. The list of files
was hardcoded so far and worked for nss_files and nss_dns and those
modules which need no monitoring. nss_db, for instance, has its
own set of files to monitor. Now the NSS modules themselves can
request that certain files are monitored.
When readding entries to the group and services cache and the lookup
is unsuccesful, we tried to write the notfound record. Just don't
do it in this case.
The nscd/*cache.c files contain assert()s, writeall() and sendfileall() calls
that invalidly use together &dataset->resp and total where either dataset or
dataset->head.recsize should be used instead one of the components. In the
writeall() and sendfileall() cases, it is unlikely to matter in practice, but
the assertions can fail sometimes without a proper reason.
The commit 20e498bd removes the pthread_mutex_rdlock() calls, but not the
corresponding pthread_mutex_unlock() calls. Also, the database lock is never
unlocked in one branch of the mempool_alloc() if.
I think unreproducible random assert(dh->usable) crashes in prune_cache() were
caused by this. But an easy way to make nscd threads hang with the broken
locking was.
There are two issues with the forced loop exit in the nscd lookup:
1. the estimate of the entry size isn't pessimistic enough for all
databases, resulting potentially is too early exits
2. the combination of 64-bit process and 32-bit nscd would lead to
rejecting valid records in the database.
The nscd database mapped in processes can change at any time. We
have to be more vigilant when it comes to using that memory. Test
the data entries are valid in their entire size, don't read data
again from memory once we verified it, and make sure the trailing
pointer is not going off the deep end.
Because we are not shutting down the other threads first another
thread might work on a query before the process shuts down. In this
case the now uninitialized libselinux and libaudit might be used.
Just don't free the resources. It's not necessary anyway because
the process is about to terminate.