As indicated by BZ#23178, concurrent access on some files read by nscd
may result non expected data send through service requisition. This is
due 'sendfile' Linux implementation where for sockets with zero-copy
support, callers must ensure the transferred portions of the the file
reffered by input file descriptor remain unmodified until the reader
on the other end of socket has consumed the transferred data.
I could not find any explicit documentation stating this behaviour on
Linux kernel documentation. However man-pages sendfile entry [1] states
in NOTES the aforementioned remark. It was initially pushed on man-pages
with an explicit testcase [2] that shows changing the file used in
'sendfile' call prior the socket input data consumption results in
previous data being lost.
From commit message it stated on tested Linux version (3.15) only TCP
socket showed this issues, however on recent kernels (4.4) I noticed the
same behaviour for local sockets as well.
Since sendfile on HURD is a read/write operation and the underlying
issue on Linux, the straightforward fix is just remove sendfile use
altogether. I am really skeptical it is hitting some hotstop (there
are indication over internet that sendfile is helpfull only for large
files, more than 10kb) here to justify that extra code complexity or
to pursuit other possible fix (through memory or file locks for
instance, which I am not sure it is doable).
Checked on x86_64-linux-gnu.
[BZ #23178]
* nscd/nscd-client.h (sendfileall): Remove prototype.
* nscd/connections.c [HAVE_SENDFILE] (sendfileall): Remove function.
(handle_request): Use writeall instead of sendfileall.
* nscd/aicache.c (addhstaiX): Likewise.
* nscd/grpcache.c (cache_addgr): Likewise.
* nscd/hstcache.c (cache_addhst): Likewise.
* nscd/initgrcache.c (addinitgroupsX): Likewise.
* nscd/netgroupcache.c (addgetnetgrentX, addinnetgrX): Likewise.
* nscd/pwdcache.c (cache_addpw): Likewise.
* nscd/servicescache.c (cache_addserv): Likewise.
* sysdeps/unix/sysv/linux/Makefile [$(subdir) == nscd]
(sysdep-CFLAGS): Remove -DHAVE_SENDFILE.
* sysdeps/unix/sysv/linux/kernel-features.h (__ASSUME_SENDFILE):
Remove define.
[1] http://man7.org/linux/man-pages/man2/sendfile.2.html
[2] 7b6a329977 (diff-efd6af3a70f0f07c578e85b51e83b3c3)
Unlike other nscd caches, the netgroup cache contains two types of
records - those for "iterate through a netgroup" (i.e. setnetgrent())
and those for "is this user in this netgroup" (i.e. innetgr()),
i.e. full and partial records. The timeout code assumes these records
have the same key for the group name, so that the collection of records
that is "this netgroup" can be expired as a unit.
However, the keys are not the same, as the in-netgroup key is generated
by nscd rather than being passed to it from elsewhere, and is generated
without the trailing NUL. All other keys have the trailing NUL, and as
noted in the linked BZ, debug statements confirm that two keys for the
same netgroup are added to the cache with two different lengths.
The result of this is that as records in the cache expire, the purge
code only cleans out one of the two types of entries, resulting in
stale, possibly incorrect, and possibly inconsistent cache data.
The patch simply includes the existing NUL in the computation for the
key length ('key' points to the char after the NUL, and 'group' to the
first char of the group, so 'key-group' includes the first char to the
NUL, inclusive).
[BZ #22342]
* nscd/netgroupcache.c (addinnetgrX): Include trailing NUL in
key value.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
[BZ #22161]
* nscd/netgroupcache.c (addinnetgrX): Release read lock after
resetting timeout.
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Reviewed-by: Andreas Schwab <schwab@suse.de>
A lock is held by mempool_allocate() when CACHEABLE is true; we
must release this lock if we exit early.
The netgroups nss modules in the glibc tree use NSS_STATUS_UNAVAIL
(with errno as ERANGE) when the supplied buffer does not have
sufficient space for the result. This is wrong, because the canonical
way to indicate insufficient buffer is to set the errno to ERANGE and
the status to NSS_STATUS_TRYAGAIN, as is used by all other modules.
This fixes nscd behaviour when the nss_ldap module returns
NSS_STATUS_TRYAGAIN to indicate that a netgroup entry is too long to
fit into the supplied buffer.
This patch consolidates the code to initialize the header of a dataset
into a single set of functions (one for positive and another for
negative datasets) primarily to reduce repetition of code. The
secondary reason is to simplify Patch 2/2 which fixes the problem of
an uninitialized byte in the header by initializing an unused field in
the structure and hence preventing a possible data leak into the cache
file.
Calls to stpcpy from nscd netgroups code will have overlapping source
and destination when all three values in the returned triplet are
non-NULL and in the expected (host,user,domain) order. This is seen
in valgrind as:
==3181== Source and destination overlap in stpcpy(0x19973b48, 0x19973b48)
==3181== at 0x4C2F30A: stpcpy (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==3181== by 0x12567A: addgetnetgrentX (string3.h:111)
==3181== by 0x12722D: addgetnetgrent (netgroupcache.c:665)
==3181== by 0x11114C: nscd_run_worker (connections.c:1338)
==3181== by 0x4E3C102: start_thread (pthread_create.c:309)
==3181== by 0x59B81AC: clone (clone.S:111)
==3181==
Fix this by using memmove instead of stpcpy.
nscd works correctly when the request in innetgr is a wildcard,
i.e. when one or more of host, user or domain parameters is NULL.
However, it does not work when the the triplet in the netgroup
definition has a wildcard. This is easy to reproduce for a triplet
defined as follows:
foonet (,foo,)
Here, an innetgr call that looks like this:
innetgr ("foonet", "foohost", "foo", NULL);
should succeed and so should:
innetgr ("foonet", NULL, "foo", "foodomain");
It does succeed with nscd disabled, but not with nscd enabled. This
fix adds this additional check for all three parts of the triplet so
that it gives the correct result.
[BZ #16758]
* nscd/netgroupcache.c (addinnetgrX): Succeed if triplet has
blank values.
The buffer to query netgroup entries is allocated sufficient space for
the netgroup entries and the key to be appended at the end, but it
sends in an incorrect available length to the NSS netgroup query
functions, resulting in overflow of the buffer in some special cases.
The fix here is to factor in the key length when sending the available
buffer and buffer length to the query functions.
The _nss_*_getnetgrent_r query populates the netgroup results in the
allocated buffer and then sets the result triplet to point to strings
in the buffer. This is a problem when the buffer is reallocated since
the pointers to the triplet strings are no longer valid. The pointers
need to be adjusted so that they now point to strings in the
reallocated buffer.
addgetnetgrentX has a buffer which is grown as per the needs of the
requested size either by using alloca or by falling back to malloc if
the size is larger than 1K. There are two problems with the alloca
bits: firstly, it doesn't really extend the buffer since it does not
use the return value of the extend_alloca macro, which is the location
of the reallocated buffer. Due to this the buffer does not actually
extend itself and hence a subsequent write may overwrite stuff on the
stack.
The second problem is more subtle - the buffer growth on the stack is
discontinuous due to block scope local variables. Combine that with
the fact that unlike realloc, extend_alloca does not copy over old
content and you have a situation where the buffer just has garbage in
the space where it should have had data.
This could have been fixed by adding code to copy over old data
whenever we call extend_alloca, but it seems unnecessarily
complicated. This code is not exactly a performance hotspot (it's
called when there is a cache miss, so factors like network lookup or
file reads will dominate over memory allocation/reallocation), so this
premature optimization is unnecessary.
Thanks Brad Hubbard <bhubbard@redhat.com> for his help with debugging
the problem.
nscd incorrectly returns a success even when the netgroup in question
is not found and adds a positive result in the cache. this patch
fixes this behaviour by adding a negative lookup entry to cache and
returning an error when the netgroup is not found.
Currently, when a user looks up a netgroup that does not have any
members, nscd goes into an infinite loop trying to find members in the
group. This is because it does not handle cases when getnetgrent
returns an NSS_STATUS_NOTFOUND (which is what it does on empty group).
Fixed to handle this in the same way as NSS_STATUS_RETURN, similar to
what getgrent does by itself.