The codecvt vtable is not a real vtable because it also contains the
conversion state data. Furthermore, wide stream support was added to
GCC 3.0, after a C++ ABI bump, so there is no compatibility
requirement with libstdc++.
This change removes several unmangled function pointers which could
be used with a corrupted FILE object to redirect execution. (libio
vtable verification did not cover the codecvt vtable.)
Reviewed-by: Yann Droneaud <ydroneaud@opteya.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
When computing the length of the converted part of the stdio buffer, use
the number of consumed wide characters, not the (negative) distance to the
end of the wide buffer.
GLIBC explicitly allows one to assign a new FILE pointer to stdout and
other standard streams. printf and wprintf were honouring assignment to
stdout and using the new value, but puts, putchar, and wide char variants
did not.
The stdout part is fixed here. The stdin part will be fixed in a followup.
C99 specifies that the EOF condition on a file is "sticky": once EOF
has been encountered, all subsequent reads should continue to return
EOF until the file is closed or something clears the "end-of-file
indicator" (e.g. fseek, clearerr). This is arguably a change from
C89, where the wording was ambiguous; the BSDs always had sticky EOF,
but the System V lineage would attempt to read from the underlying fd
again. GNU libc has followed System V for as long as we've been
using libio, but nowadays C99 conformance and BSD compatibility are
more important than System V compatibility.
You might wonder if changing the _underflow impls is sufficient to
apply the C99 semantics to all of the many stdio functions that
perform input. It should be enough to cover all paths to _IO_SYSREAD,
and the only other functions that call _IO_SYSREAD are the _seekoff
impls, which is OK because seeking clears EOF, and the _xsgetn impls,
which, as far as I can tell, are unused within glibc.
The test programs in this patch use a pseudoterminal to set up the
necessary conditions. To facilitate this I added a new test-support
function that sets up a pair of pty file descriptors for you; it's
almost the same as BSD openpty, the only differences are that it
allocates the optionally-returned tty pathname with malloc, and that
it crashes if anything goes wrong.
[BZ #1190]
[BZ #19476]
* libio/fileops.c (_IO_new_file_underflow): Return EOF immediately
if the _IO_EOF_SEEN bit is already set; update commentary.
* libio/oldfileops.c (_IO_old_file_underflow): Likewise.
* libio/wfileops.c (_IO_wfile_underflow): Likewise.
* support/support_openpty.c, support/tty.h: New files.
* support/Makefile (libsupport-routines): Add support_openpty.
* libio/tst-fgetc-after-eof.c, wcsmbs/test-fgetwc-after-eof.c:
New test cases.
* libio/Makefile (tests): Add tst-fgetc-after-eof.
* wcsmbs/Makefile (tests): Add tst-fgetwc-after-eof.
This patch eliminates a number of #if 0 and #ifdef TODO blocks, macros
that are never used, macros that provide portability to substrates that
lack basic things like EINVAL and off_t, and other such debris.
I preserved IO_DEBUG and CHECK_FILE, even though as far as I can tell
IO_DEBUG is never defined and therefore CHECK_FILE never does
anything, because it seems like we might actually want to turn it _on_.
Installed stripped libraries and executables are unchanged, except,
again, that the line number of an assertion changes (this time it's
somewhere in fileops.c).
* libio/libio.h (_IO_pos_BAD, _IO_pos_0, _IO_pos_adjust):
Define here, unconditionally.
* libio/iolibio.h (_IO_pos_BAD): Don't define here.
* libio/libioP.h: Remove #if 0 blocks.
(_IO_pos_BAD, _IO_pos_0, _IO_pos_adjust): Don't define here.
(_IO_va_start, COERCE_FILE, MAYBE_SET_EINVAL): Don't define.
(CHECK_FILE): Don't use MAYBE_SET_EINVAL or COERCE_FILE. Fix style.
* libio/clearerr.c, libio/fputc.c, libio/getchar.c:
Assume weak_alias is always defined.
* libio/fileops.c, libio/genops.c, libio/oldfileops.c
* libio/oldpclose.c, libio/pclose.c, libio/wfileops.c:
Remove #if 0 and #ifdef TODO blocks.
Assume text_set_element is always defined.
* libio/iofdopen.c, libio/iogetdelim.c, libio/oldiofdopen.c
Use __set_errno (EINVAL) instead of MAYBE_SET_EINVAL.
* libio/tst-mmap-eofsync.c: Make #if 1 block unconditional.
This patch mechanically removes all remaining uses, and the
definitions, of the following libio name aliases:
name replaced with
---- -------------
_IO_FILE FILE
_IO_fpos_t __fpos_t
_IO_fpos64_t __fpos64_t
_IO_size_t size_t
_IO_ssize_t ssize_t or __ssize_t
_IO_off_t off_t
_IO_off64_t off64_t
_IO_pid_t pid_t
_IO_uid_t uid_t
_IO_wint_t wint_t
_IO_va_list va_list or __gnuc_va_list
_IO_BUFSIZ BUFSIZ
_IO_cookie_io_functions_t cookie_io_functions_t
__io_read_fn cookie_read_function_t
__io_write_fn cookie_write_function_t
__io_seek_fn cookie_seek_function_t
__io_close_fn cookie_close_function_t
I used __fpos_t and __fpos64_t instead of fpos_t and fpos64_t because
the definitions of fpos_t and fpos64_t depend on the largefile mode.
I used __ssize_t and __gnuc_va_list in a handful of headers where
namespace cleanliness might be relevant even though they're
internal-use-only. In all other cases, I used the public-namespace
name.
There are a tiny handful of places where I left a use of 'struct _IO_FILE'
alone, because it was being used together with 'struct _IO_FILE_plus'
or 'struct _IO_FILE_complete' in the same arithmetic expression.
Because this patch was almost entirely done with search and replace, I
may have introduced indentation botches. I did proofread the diff,
but I may have missed something.
The ChangeLog below calls out all of the places where this was not a
pure search-and-replace change.
Installed stripped libraries and executables are unchanged by this patch,
except that some assertions in vfscanf.c change line numbers.
* libio/libio.h (_IO_FILE): Delete; all uses changed to FILE.
(_IO_fpos_t): Delete; all uses changed to __fpos_t.
(_IO_fpos64_t): Delete; all uses changed to __fpos64_t.
(_IO_size_t): Delete; all uses changed to size_t.
(_IO_ssize_t): Delete; all uses changed to ssize_t or __ssize_t.
(_IO_off_t): Delete; all uses changed to off_t.
(_IO_off64_t): Delete; all uses changed to off64_t.
(_IO_pid_t): Delete; all uses changed to pid_t.
(_IO_uid_t): Delete; all uses changed to uid_t.
(_IO_wint_t): Delete; all uses changed to wint_t.
(_IO_va_list): Delete; all uses changed to va_list or __gnuc_va_list.
(_IO_BUFSIZ): Delete; all uses changed to BUFSIZ.
(_IO_cookie_io_functions_t): Delete; all uses changed to
cookie_io_functions_t.
(__io_read_fn): Delete; all uses changed to cookie_read_function_t.
(__io_write_fn): Delete; all uses changed to cookie_write_function_t.
(__io_seek_fn): Delete; all uses changed to cookie_seek_function_t.
(__io_close_fn): Delete: all uses changed to cookie_close_function_t.
* libio/iofopncook.c: Remove unnecessary forward declarations.
* libio/iolibio.h: Correct outdated commentary.
* malloc/malloc.c (__malloc_stats): Remove unnecessary casts.
* stdio-common/fxprintf.c (__fxprintf_nocancel):
Remove unnecessary casts.
* stdio-common/getline.c: Use _IO_getdelim directly.
Don't redefine ssize_t.
* stdio-common/printf_fp.c, stdio_common/printf_fphex.c
* stdio-common/printf_size.c: Don't redefine size_t or FILE.
Remove outdated comments.
* stdio-common/vfscanf.c: Don't redefine va_list.
Some libio operations fail to correctly free the backup area (created
by _IO_{w}default_pbackfail on unget{w}c) resulting in either invalid
buffer free operations or memory leaks.
For instance, on the example provided by BZ#22415 a following
fputc after a fseek to rewind the stream issues an invalid free on
the buffer. It is because although _IO_file_overflow correctly
(from fputc) correctly calls _IO_free_backup_area, the
_IO_new_file_seekoff (called by fseek) updates the FILE internal
pointers without first free the backup area (resulting in invalid
values in the internal pointers).
The wide version also shows an issue, but instead of accessing invalid
pointers it leaks the backup memory on fseek/fputwc operation.
Checked on x86_64-linux-gnu and i686-linux-gnu.
* libio/Makefile (tests): Add tst-bz22415.
(tst-bz22415-ENV): New rule.
(generated): Add tst-bz22415.mtrace and tst-bz22415.check.
(tests-special): Add tst-bz22415-mem.out.
($(objpfx)tst-bz22415-mem.out): New rule.
* libio/fileops.c (_IO_new_file_seekoff): Call _IO_free_backup_area
in case of a successful seek operation.
* libio/wfileops.c (_IO_wfile_seekoff): Likewise.
(_IO_wfile_overflow): Call _IO_free_wbackup_area in case a write
buffer is required.
* libio/tst-bz22415.c: New test.
This commit puts all libio vtables in a dedicated, read-only ELF
section, so that they are consecutive in memory. Before any indirect
jump, the vtable pointer is checked against the section boundaries,
and the process is terminated if the vtable pointer does not fall into
the special ELF section.
To enable backwards compatibility, a special flag variable
(_IO_accept_foreign_vtables), protected by the pointer guard, avoids
process termination if libio stream object constructor functions have
been called earlier. Such constructor functions are called by the GCC
2.95 libstdc++ library, and this mechanism ensures compatibility with
old binaries. Existing callers inside glibc of these functions are
adjusted to call the original functions, not the wrappers which enable
vtable compatiblity.
The compatibility mechanism is used to enable passing FILE * objects
across a static dlopen boundary, too.
Both of "_IO_UNBUFFERED" and "_IO_LINE_BUF" are the bit flags, but I
find there are some codes looks like "_IO_LINE_BUF+_IO_UNBUFFERED",
while some codes are "_IO_LINE_BUF|_IO_UNBUFFERED".
I think the former is not good, even though the final result is same.
POSIX allows applications to switch file handles when a read results
in an end of file. Unset the cached offset at this point so that it
is queried again.
Currently we seek to end of file if there are unflushed writes or the
stream is in write mode, to get the current offset for writing in
append mode, which is the end of file. The latter case (i.e. stream
is in write mode, but no unflushed writes) is unnecessary since it
will only happen when the stream has just been flushed, in which case
the recorded offset ought to be reliable.
Removing that case lets ftell give the correct offset when it follows
an ftruncate. The latter truncates the file, but does not change the
file position, due to which it is permissible to call ftell without an
intervening fseek call.
Tested on x86_64 to verify that the added test case fails without the
patch and succeeds with it, and that there are no additional
regressions due to it.
[BZ #17647]
* libio/fileops.c (do_ftell): Seek only when there are
unflushed writes.
* libio/wfileops.c (do_ftell_wide): Likewise.
* libio/tst-ftell-active-handler.c (do_ftruncate_test): New
test case.
(do_one_test): Call it.
The offset computation in write mode uses the fact that _IO_read_end
is kept in sync with the external file offset. This however is not
true when O_APPEND is in effect since switching to write mode ought to
send the external file offset to the end of file without making the
necessary adjustment to _IO_read_end.
Hence in append mode, offset computation when writing should only
consider the effect of unflushed writes, i.e. from _IO_write_base to
_IO_write_ptr.
The wiki has a detailed document that describes the rationale for
offsets returned by ftell in various conditions:
https://sourceware.org/glibc/wiki/File%20offsets%20in%20a%20stdio%20stream%20and%20ftell
The ftell implementation was made conservative to ensure that
incorrectly cached offsets never affect it. However, this causes
problems for append mode when a file stream is rewound. Additionally,
the 'clever' trick of using stat to get position for append mode files
caused more problems than it solved and broke old behavior. I have
described the various problems that it caused and then finally the
solution.
For a and a+ mode files, rewinding the stream should result in ftell
returning 0 as the offset, but the stat() trick caused it to
(incorrectly) always return the end of file. Now I couldn't find
anything in POSIX that specifies the stream position after rewind()
for a file opened in 'a' mode, but for 'a+' mode it should be set to
0. For 'a' mode too, it probably makes sense to keep it set to 0 in
the interest of retaining old behavior.
The initial file position for append mode files is implementation
defined, so the implementation could either retain the current file
position or move the position to the end of file. The earlier ftell
implementation would move the offset to end of file for append-only
mode, but retain the old offset for a+ mode. It would also cache the
offset (this detail is important). My patch broke this and would set
the initial position to end of file for both append modes, thus
breaking old behavior. I was ignorant enough to write an incorrect
test case for it too.
The Change:
I have now brought back the behavior of seeking to end of file for
append-only streams, but with a slight difference. I don't cache the
offset though, since we would want ftell to query the current file
position through lseek while the stream is not active. Since the
offset is moved to the end of file, we can rely on the file position
reported by lseek and we don't need to resort to the stat() nonsense.
Finally, the cache is always reliable, except when there are unflished
writes in an append mode stream (i.e. both a and a+). In the latter
case, it is safe to just do an lseek to SEEK_END. The value can be
safely cached too, since the file handle is already active at this
point. Incidentally, this is the only state change we affect in the
file handle (apart from taking locks of course).
I have also updated the test case to correct my impression of the
initial file position for a+ streams to the initial behavior. I have
verified that this does not break any existing tests in the testsuite
and also passes with the new tests.
The cached offset is reliable to use in ftell when the stream handle
is active. We can consider a stream as being active when there is
unflushed data. However, even in this case, we can use the cached
offset only when the stream is not being written to in a+ mode,
because this case may have unflushed data and a stale offset; the
previous read could have sent it off somewhere other than the end of
the file.
There were a couple of adjustments necessary to get this to work.
Firstly, fdopen now ceases to use _IO_attach_fd because it sets the
offset cache to the current file position. This is not correct
because there could be changes to the file descriptor before the
stream handle is activated, which would not get reflected.
A similar offset caching action is done in _IO_fwide, claiming that
wide streams have 'problems' with the file offsets. There don't seem
to be any obvious problems with not having the offset cache available,
other than that it will have to be queried in a subsequent
read/write/seek. I have removed this as well.
The testsuite passes successfully with these changes on x86_64.
ftell semantics are distinct from fseek(SEEK_CUR) especially when it
is called on a file handler that is not yet active. Due to this
caveat, much care needs to be taken while modifying the handler data
and hence, this first iteration on separating out ftell focusses on
maintaining handler data integrity at all times while it figures out
the current stream offset. The result is that it makes a syscall for
every offset request.
There is scope for optimizing this by caching offsets when we know
that the handler is active. A simple way to find out is when the
buffers have data. It is not so simple to find this out when the
buffer is empty without adding some kind of flag.
ftell tries to avoid flushing the buffer when it is in write mode by
converting the wide char data and placing it into the binary buffer.
If the output buffer space is full and there is data to write, the
code reverts to flushing the buffer. This breaks when there is space
in the buffer but it is not enough to convert the next character in
the wide data buffer, due to which __codecvt_do_out returns a
__codecvt_partial status. In this case, ftell keeps running in an
infinite loop.
The fix here is to detect the __codecvt_partial status in addition to
checking if the buffer is full. I have also added a test case that
demonstrates the infinite loop.
[BZ #14543]
Set the internal buffer state correctly whenever the external buffer
state is modified by fseek by either computing the current
_IO_read_ptr/end for the internal buffer based on the new _IO_read_ptr
in the external buffer or converting the content read into the
external buffer, up to the extent of the requested fseek offset.
* libio/wfileops.c (_IO_wfile_underflow): Fix handling of
incomplete characters at end of input buffer.
* libio/Makefile (tests): Add tst-fgetwc.
* libio/tst-fgetwc.c: New file.
* libio/tst-fgetwc.input: New file.