In several converters, a __GCONV_ILLEGAL_INPUT result gets overwritten
with __GCONV_FULL_OUTPUT. As a result, iconv (the function) returns
E2BIG instead of EILSEQ. The iconv program does not see the original
EILSEQ failure, does not recognize the invalid input, and may
incorrectly exit successfully.
To address this, a new __flags bit is used to indicate a sticky input
error state. All __GCONV_ILLEGAL_INPUT results are replaced with a
function call that sets this new __GCONV_ENCOUNTERED_ILLEGAL_INPUT and
returns __GCONV_ILLEGAL_INPUT. The iconv program checks for
__GCONV_ENCOUNTERED_ILLEGAL_INPUT and overrides the exit status.
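
The sticky-flag idea described above can be sketched in isolation as follows. This is a minimal illustration, not the glibc code: the sk_ names, the numeric values of the result codes, and the flag bit are all hypothetical stand-ins, and the toy converter only imitates the "illegal input overwritten by full output" pattern.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the gconv result codes; values are
   illustrative only and are not taken from glibc.  */
enum { SK_OK = 0, SK_FULL_OUTPUT = 5, SK_ILLEGAL_INPUT = 6 };

/* Hypothetical sticky flag bit recording that illegal input was seen.  */
enum { SK_ENCOUNTERED_ILLEGAL_INPUT = 1 << 2 };

/* Hypothetical, reduced version of the per-step conversion state.  */
struct sk_step_data
{
  int flags;
};

/* Record the input error in the sticky flag and return the
   illegal-input result, so call sites can use it in place of the
   bare result code.  */
static int
sk_mark_illegal_input (struct sk_step_data *d)
{
  assert (d != NULL);
  d->flags |= SK_ENCOUNTERED_ILLEGAL_INPUT;
  return SK_ILLEGAL_INPUT;
}

/* Toy converter: detects illegal input, but a later output-space
   check overwrites the result, as described in the message above.  */
static int
sk_convert (struct sk_step_data *d)
{
  int res = sk_mark_illegal_input (d);
  res = SK_FULL_OUTPUT;		/* The original result is lost here.  */
  return res;
}

/* Caller-side check: the exit status depends on the sticky flag,
   not on the last result code.  */
static int
sk_exit_status (const struct sk_step_data *d)
{
  return (d->flags & SK_ENCOUNTERED_ILLEGAL_INPUT) != 0 ? 1 : 0;
}
```

In this sketch, sk_convert returns SK_FULL_OUTPUT, yet sk_exit_status still reports failure because the flag survives the overwritten result. The result code itself is left untouched, so a caller that looks only at the return value sees no change.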
The converter changes introducing __gconv_mark_illegal_input are
mostly mechanical, except for the res variable initialization in
iconvdata/iso-2022-jp.c: this error gets overwritten with __GCONV_OK
and other results in the following code. If res ==
__GCONV_ILLEGAL_INPUT afterwards, STANDARD_TO_LOOP_ERR_HANDLER below
will handle it.
The __gconv_mark_illegal_input changes do not alter the errno value
set by the iconv function. This is simpler to implement than
reviewing each __GCONV_FULL_OUTPUT result and adjusting it not to
override a previous __GCONV_ILLEGAL_INPUT result. Doing it that way
would also change some E2BIG errors into EILSEQ errors, so it would
have to be done conditionally (under a flag set by the iconv program
only), to avoid confusing buffer management in other applications.
Reviewed-by: DJ Delorie <dj@redhat.com>
I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted of lines saying "FOO: warning:
copyright statement not found" for each of 7061 files FOO.
I then removed trailing white space from math/tgmath.h,
support/tst-support-open-dev-null-range.c, and
sysdeps/x86_64/multiarch/strlen-vec.S, to work around the following
obscure pre-commit check failure diagnostics from Savannah. I don't
know why I run into these diagnostics whereas others evidently do not.
remote: *** 912-#endif
remote: *** 913:
remote: *** 914-
remote: *** error: lines with trailing whitespace found
...
remote: *** error: sysdeps/unix/sysv/linux/statx_cp.c: trailing lines
We stopped adding "Contributed by" or similar lines in sources in 2012
in favour of git logs and keeping the Contributors section of the
glibc manual up to date. Removing these lines makes the license
header a bit more consistent across files and also removes the
possibility of error in attribution when license blocks or files are
copied across since the contributed-by lines don't actually reflect
reality in those cases.
Move all "Contributed by" and similar lines (Written by, Test by,
etc.) into a new file CONTRIBUTED-BY to retain record of these
contributions. These contributors are also mentioned in
manual/contrib.texi, so we just maintain this additional record as a
courtesy to the earlier developers.
The following scripts were used to filter a list of files to edit in
place and to clean up the CONTRIBUTED-BY file respectively. These
were not added to the glibc sources because they're not expected to be
of any use in future given that this is a one-time task:
https://gist.github.com/siddhesh/b5ecac94eabfd72ed2916d6d8157e7dc
https://gist.github.com/siddhesh/15ea1f5e435ace9774f485030695ee02
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted of lines saying "FOO: warning:
copyright statement not found" for each of 6694 files FOO.
I then removed trailing white space from benchtests/bench-pthread-locks.c
and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this
diagnostic from Savannah:
remote: *** pre-commit check failed ...
remote: *** error: lines with trailing whitespace found
remote: error: hook declined to update refs/heads/master
char array resp. pointer.
* iconvdata/iso-2022-kr.c (BODY): Make buf unsigned char instead of
char array.
* iconvdata/cns11643.h (cns11643_to_ucs4): Change first argument
to const unsigned char **.
(ucs4_to_cns11643): Change second argument to unsigned char *.
* iconvdata/euc-tw.c (BODY): Change endp type to
const unsigned char *.
* iconvdata/iso-ir-165.h (ucs4_to_isoir165): Change second argument
to unsigned char *.
* iconvdata/ibm1008_420.c (LOOP_NEED_FLAGS): Don't define.
* iconvdata/iso-2022-cn.c (BODY): Change buf to unsigned char array.
* iconvdata/iso-2022-cn-ext.c (BODY): Change buf, tmpbuf, tmp
types to unsigned char pointers/arrays instead of char.
* iconvdata/jis0201.h (ucs4_to_jisx0201): Change second argument
to unsigned char *.
* iconvdata/jis0208.h (ucs4_to_jisx0208): Likewise.
* iconvdata/jis0212.h: Include assert.h.
(ucs4_to_jisx0212): Change second argument to unsigned char *.
assert that if cp[0] is not '\0', cp[1] is not '\0' either instead
of trying to handle that.
* iconvdata/euc-kr.c (euckr_from_ucs4): Initialize also cp[1] to
shut up a warning.
* iconvdata/euc-jp-ms.c (from_ucs4_lat1, from_ucs4_greek,
from_ucs4_cjk, from_ucs4_cjkcpt, from_ucs4_extra): Change type to
two dimensional const unsigned char arrays.
(BODY): Cast "" to (const unsigned char *) for assignment to cp.
Initialize endp to inptr to shut up a warning.
2003-08-14 Ulrich Drepper <drepper@redhat.com>
* iconvdata/cp932.c: Fixed checking of a few code area borders.
Changed conversion of JIS X 0201 from using a table to calculating.
* iconvdata/euc-jp-ms.c: Fixed conversion table and rewrote
conversion routine. Changed CHARSET_NAME definition from EUCJP-MS to
EUC-JP-MS.
* iconvdata/tst-tables.sh: Add CP932 and EUC-JP-MS.
* iconvdata/CP932.irreversible: New file.
* iconvdata/EUC-JP-MS.irreversible: New file.
Patch by MORIYAMA Masayuki <msyk@mtg.biglobe.ne.jp>.