glibc

mirror of https://sourceware.org/git/glibc.git synced 2024-12-26 04:31:03 +00:00

Author	SHA1	Message	Date
Szabolcs Nagy	cef9089a68	cheri: Fix capability permissions of PROT_NONE map in locarchive	2022-10-26 15:39:59 +01:00
Szabolcs Nagy	c3d2d246c5	static: glibc-bug: NL_CURRENT_INDIRECT is broken so disable it nl_langinfo_l ignores its locale argument with NL_CURRENT_INDIRECT which is wrong when that argument does not match the current thread's locale. upstream glibc is not tested with static linking so this is not found.	2022-10-12 14:22:03 +01:00
Adhemerval Zanella	6c4ed247bf	locale: Optimize tst-localedef-path-norm The locale generation are issues in parallel to try speed locale generation. The maximum number of jobs are limited to the online CPU (in hope to not overcommit on environments with lower cores than tests). On a Ryzen 9, the test execution improves from ~6.7s to ~1.4s. Tested-by: Mark Wielaard <mark@klomp.org>	2022-07-22 11:06:16 -03:00
Florian Weimer	9d77023bf3	localedef: Support building for older C standards Fixes commit `b15538d77c` ("locale: localdef input files are now encoded in UTF-8").	2022-07-05 10:30:20 +02:00
Florian Weimer	b15538d77c	locale: localdef input files are now encoded in UTF-8 Previously, they were assumed to be in ISO-8859-1, and that the output charset overlapped with ISO-8859-1 for the characters actually used. However, this did not work as intended on many architectures even for an ISO-8859-1 output encoding because of the char signedness bug in lr_getc. Therefore, this commit switches to UTF-8 without making provisions for backwards compatibility. The following Elisp code can be used to convert locale definition files to UTF-8: (defun glibc/convert-localedef (from to) (interactive "r") (save-excursion (save-restriction (narrow-to-region from to) (goto-char (point-min)) (save-match-data (while (re-search-forward "<U\\([0-9a-fA-F]+\\)>" nil t) (let* ((codepoint (string-to-number (match-string 1) 16)) (converted (cond ((memq codepoint '(?/ ?\ ?< ?>)) (string ?/ codepoint)) ((= codepoint ?\") "<U0022>") (t (string codepoint))))) (replace-match converted t))))))) Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>	2022-07-05 09:06:50 +02:00
Florian Weimer	7dcaabb94c	locale: Introduce translate_unicode_codepoint into linereader.c This will permit reusing the Unicode character processing for different character encodings, not just the current <U...> encoding. Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>	2022-07-05 09:06:39 +02:00
Florian Weimer	19d4944459	locale: Fix signed char bug in lr_getc The array lr->buf contains characters, which can be signed. A 0xff byte in the input could be incorrectly reported as EOF. More importantly, get_string in linereader.c converts a signed input byte to a Unicode code point using ADDWC ((uint32_t) ch), under the assumption that this decodes the ISO-8859-1 input encoding. If char is signed, this does not give the correct result. This means that ISO-8859-1 input files for localedef are not actually supported, contrary to the comment in get_string. This is a happy accident because we can therefore change the file encoding to UTF-8 without impacting backwards compatibility. While at it, remove the \32 check for MS-DOS end-of-file character (^Z). Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>	2022-07-05 09:06:28 +02:00
Florian Weimer	5dcbff5879	locale: Turn ADDC and ADDS into functions in linereader.c And introduce struct lr_buffer. The functions addc and adds can be called from functions, enabling subsequent refactoring. Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>	2022-07-05 09:06:15 +02:00
Florian Weimer	93ec1cf0fe	locale: Add more cached data to LC_CTYPE This data will be used in number formatting. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-05-23 11:06:31 +02:00
Florian Weimer	7ee41feba6	locale: Remove private union from struct __locale_data This avoids an alias violation later. This commit also fixes an incorrect double-checked locking idiom in _nl_init_era_entries. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-05-23 11:06:31 +02:00
Florian Weimer	bbebe83a28	locale: Remove cleanup function pointer from struct __localedata We can call the cleanup functions directly from _nl_unload_locale if we pass the category to it. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-05-23 11:06:31 +02:00
Florian Weimer	0b6342e769	locale: Call _nl_unload_locale from _nl_archive_subfreeres The function performs the same steps for ld_archive locales (mapped from an archive), and this code is not performance-critical, so the specialization does not add value. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2022-05-23 11:06:31 +02:00
Noah Goldstein	535e935a28	Replace {u}int_fast{16|32} with {u}int32_t On 32-bit machines this has no affect. On 64-bit machines {u}int_fast{16|32} are set as {u}int64_t which is often not ideal. Particularly x86_64 this change both saves code size and may save instruction cost. Full xcheck passes on x86_64.	2022-04-13 21:23:04 -05:00
Ilyahoo Proshel	189906b687	Add rif_MA locale [BZ #27781] Resolves: BZ #27781	2022-04-07 14:59:41 +02:00
Adhemerval Zanella	c5c65de1b2	locale: Remove set but unused variable on ld-collate.c Checked on x86_64-linux-gnu and i686-linux-gnu.	2022-03-31 08:53:40 -03:00
Adhemerval Zanella	41397b9337	locale: Remove ununsed wctype_table_get function	2022-03-23 14:32:23 -03:00
Carlos O'Donell	70f021e66a	Define ISO 639-3 "tok" [BZ #28950] Effective 2022-01-20 via SIL request 2021-043 the identifier "tok" is now active for Toki Pona in the code set for ISO 639-3. References: https://iso639-3.sil.org/code/tok https://iso639-3.sil.org/sites/iso639-3/files/change_requests/2021/2021-043.pdf No regressions on x86_64.	2022-03-14 08:44:00 -04:00
Carlos O'Donell	2ab8b74567	localedef: Update LC_MONETARY handling (Bug 28845) ISO C17, POSIX Issue 7, and ISO 30112 all allow the char* types to be empty strings i.e. "", integer or char values to be -1 or CHAR_MAX respectively, with the exception of decimal_point which must be non-empty in ISO C. Note that the defaults for mon_grouping vary, but are functionaly equivalent e.g. "\177" (no further grouping reuqired) vs. "" (no grouping defined for all groups). We include a broad comment talking about harmonizing ISO C, POSIX, ISO 30112, and the default C/POSIX locale for glibc. We reorder all setting based on locale/categories.def order. We soften all missing definitions from errors to warnings when defaults exist. Given that ISO C, POSIX and ISO 30112 allow the empty string we change LC_MONETARY handling of mon_decimal_point to allow the empty string. If mon_decimal_point is not defined at all then we pick the existing legacy glibc default value of <U002E> i.e. ".". We also set the default for mon_thousands_sep_wc at the same time as mon_thousands_sep, but this is not a change in behaviour, it is always either a matching value or L'\0', but if in the future we change the default to a non-empty string we would need to update both at the same time. Tested on x86_64 and i686 without regressions. Tested with install-locale-archive target. Tested with install-locale-files target. Reviewed-by: DJ Delorie <dj@redhat.com>	2022-02-25 07:31:27 -05:00
Arjun Shankar	ea89d5bbd9	localedef: Handle symbolic links when generating locale-archive Whenever locale data for any locale included symbolic links, localedef would throw the error "incomplete set of locale files" and exclude it from the generated locale archive. This commit fixes that. Co-authored-by: Florian Weimer <fweimer@redhat.com> Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2022-02-24 22:45:05 +01:00
Carlos O'Donell	1d8e3a2c66	localedef: Fix handling of empty mon_decimal_point (Bug 28847) The handling of mon_decimal_point is incorrect when it comes to handling the empty "" value. The existing parser in monetary_read() will correctly handle setting the non-wide-character value and the wide-character value e.g. STR_ELEM_WC(mon_decimal_point) if they are set in the locale definition. However, in monetary_finish() we have conflicting TEST_ELEM() which sets a default value (if the locale definition doesn't include one), and subsequent code which looks for mon_decimal_point to be NULL to issue a specific error message and set the defaults. The latter is unused because TEST_ELEM() always sets a default. The simplest solution is to remove the TEST_ELEM() check, and allow the existing check to look to see if mon_decimal_point is NULL and set an appropriate default. The final fix is to move the setting of mon_decimal_point_wc so it occurs only when mon_decimal_point is being set to a default, keeping both values consistent. There is no way to tell the difference between mon_decimal_point_wc having been set to the empty string and not having been defined at all, for that distinction we must use mon_decimal_point being NULL or "", and so we must logically set the default together with mon_decimal_point. Lastly, there are more fixes similar to this that could be made to ld-monetary.c, but we avoid that in order to fix just the code required for mon_decimal_point, which impacts the ability for C.UTF-8 to set mon_decimal_point to "", since without this fix we end up with an inconsistent setting of mon_decimal_point set to "", but mon_decimal_point_wc set to "." which is incorrect. Tested on x86_64 and i686 without regression. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2022-02-01 11:12:17 -05:00
Paul Eggert	b92a49359f	Update automatically-generated copyright dates These were updated simply by running "make" to regen the files.	2022-01-01 11:42:26 -08:00
Paul Eggert	634b5ebac6	Update copyright dates not handled by scripts/update-copyrights. I've updated copyright dates in glibc for 2022. This is the patch for the changes not generated by scripts/update-copyrights and subsequent build / regeneration of generated files. As well as the usual annual updates, mainly dates in --version output (minus csu/version.c which previously had to be handled manually but is now successfully updated by update-copyrights), there is a small change to the copyright notice in NEWS which should let NEWS get updated automatically next year. Please remember to include 2022 in the dates for any new files added in future (which means updating any existing uncommitted patches you have that add new files to use the new copyright dates in them).	2022-01-01 11:42:26 -08:00
Paul Eggert	581c785bf3	Update copyright dates with scripts/update-copyrights I used these shell commands: ../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright (cd ../glibc && git commit -am"[this commit message]") and then ignored the output, which consisted lines saying "FOO: warning: copyright statement not found" for each of 7061 files FOO. I then removed trailing white space from math/tgmath.h, support/tst-support-open-dev-null-range.c, and sysdeps/x86_64/multiarch/strlen-vec.S, to work around the following obscure pre-commit check failure diagnostics from Savannah. I don't know why I run into these diagnostics whereas others evidently do not. remote: * 912-#endif remote: * 913: remote: * 914- remote: * error: lines with trailing whitespace found ... remote: *** error: sysdeps/unix/sysv/linux/statx_cp.c: trailing lines	2022-01-01 11:40:24 -08:00
Aurelien Jarno	cbab7f7268	localedef: check magic value on archive load [BZ #28650] localedef currently blindly trust the archive header. When passed an archive file with the wrong endianess, this leads to a segmentation fault: $ localedef --big-endian --list-archive /usr/lib/locale/locale-archive Segmentation fault (core dumped) When passed non-archive files, asserts are reported on the best case, but sometimes it can lead to a segmentation fault: $ localedef --list-archive /bin/true localedef: programs/locarchive.c:1643: show_archive_content: Assertion `used < GET (head->namehash_used)' failed. Aborted (core dumped) $ localedef --list-archive /usr/lib/locale/C.utf8/LC_COLLATE Segmentation fault (core dumped) This patch improves the user experience by looking at the magic value, which is always written, but never checked. It should still be possible to trigger a segmentation fault with crafted files, but this already catch many cases.	2021-12-07 23:32:53 +01:00
Florian Weimer	b8c6166b1b	locale: Add missing second argument to _Static_assert in C-collate-seq.c	2021-09-06 19:43:37 +02:00
Carlos O'Donell	f5117c6504	Add 'codepoint_collation' support for LC_COLLATE. Support a new directive 'codepoint_collation' in the LC_COLLATE section of a locale source file. This new directive causes all collation rules to be dropped and instead STRCMP (strcmp or wcscmp) is used for collation of the input character set. This is required to allow for a C.UTF-8 that contains zero collation rules (minimal size) and sorts using code point sorting. To date the only implementation of a locale with zero collation rules is the C/POSIX locale. The C/POSIX locale provides identity tables for _NL_COLLATE_COLLSEQMB and _NL_COLLATE_COLLSEQWC that map to ASCII even though it has zero rules. This has lead to existing fnmatch, regexec, and regcomp implementations that require these tables. It is not correct to use these tables when nrules == 0, but the conservative fix is to provide these tables when nrules == 0. This assures that existing static applications using a new C.UTF-8 locale with 'codepoint_collation' at least have functional range expressions with ASCII e.g. [0-9] or [a-z]. Such static applications would not have the fixes to fnmatch, regexec and regcomp that avoid the use of the tables when nrules == 0. Future fixes to fnmatch, regexec, and regcomp would allow range expressions to use the full set of code points for such ranges. Tested on x86_64 and i686 without regression. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2021-09-06 11:06:45 -04:00
Siddhesh Poyarekar	30891f35fa	Remove "Contributed by" lines We stopped adding "Contributed by" or similar lines in sources in 2012 in favour of git logs and keeping the Contributors section of the glibc manual up to date. Removing these lines makes the license header a bit more consistent across files and also removes the possibility of error in attribution when license blocks or files are copied across since the contributed-by lines don't actually reflect reality in those cases. Move all "Contributed by" and similar lines (Written by, Test by, etc.) into a new file CONTRIBUTED-BY to retain record of these contributions. These contributors are also mentioned in manual/contrib.texi, so we just maintain this additional record as a courtesy to the earlier developers. The following scripts were used to filter a list of files to edit in place and to clean up the CONTRIBUTED-BY file respectively. These were not added to the glibc sources because they're not expected to be of any use in future given that this is a one time task: https://gist.github.com/siddhesh/b5ecac94eabfd72ed2916d6d8157e7dc https://gist.github.com/siddhesh/15ea1f5e435ace9774f485030695ee02 Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2021-09-03 22:06:44 +05:30
Adhemerval Zanella	52a5fe70a2	Use 64 bit time_t stat internally For the legacy ABI with supports 32-bit time_t it calls the 64-bit time directly, since the LFS symbols calls the 64-bit time_t ones internally. Checked on i686-linux-gnu and x86_64-linux-gnu. Reviewed-by: Lukasz Majewski <lukma@denx.de>	2021-06-22 12:09:52 -03:00
Siddhesh Poyarekar	2317101658	show_archive_content: Fix trivial memory leak Fix trivial leak identified by coverity. The program runs to exit and the leak doesn't grow, but it's just cleaner to free the allocated memory. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2021-05-18 09:07:06 +05:30
Siddhesh Poyarekar	213573f86e	write_archive_locales: Fix memory leak Fix memory leak identified by coverity.	2021-05-11 17:57:30 +05:30
Siddhesh Poyarekar	1d25bd274c	get-translit.py: Fix typo	2021-05-11 12:55:45 +05:30
Lirong Yuan	7b414d6e7b	locale: Align _nl_C_LC_CTYPE_class and _nl_C_LC_CTYPE_class32 Otherwise, programs that use character classification macros such as isspace may observe unaligned pointers.	2021-05-03 16:10:18 +02:00
Hanataka Shinya	82292c99b2	LC_COLLATE: Fix last character ellipsis handling (Bug 22668) During ellipsis processing the collation cursor was not correctly moved to the end of the ellipsis after processing. The code inserted the new entry after the cursor, but before the real end of the ellipsis: [cursor] ... element_t <-> element_t <-> element_t <-> element_t "<U0000>" "<U0001>" "<U007F>" startp endp At the end of the function we have: [cursor] ... element_t <-> element_t <-> element_t "<U007E>" "<U007F>" endp The cursor should be pointing at endp, the last element in the doubly-linked list, otherwise when execution returns to the caller we will start inserting the next line after <U007E>. Subsequent operations end up unlinking the ellipsis end entry or just leaving it in the list dangling from the end. This kind of dangling is immediately visible in C.UTF-8 with the following sorting from strcoll: <U0010FFFF> <U0000FFFF> <U000007FF> <U0000007F> With the cursor correctly adjusted the end entry is correctly given the right location and thus the right weight. Retested and no regressions on x86_64 and i686. Co-authored-by: Carlos O'Donell <carlos@redhat.com>	2021-04-26 08:03:32 -04:00
Florian Weimer	6d8fcee694	locale: Use compat_symbol_reference in _nl_postload_ctype These symbol usages are not definitions, so compat_symbol_reference is more appropriate than compat_symbol. compat_symbol_reference is also safe to emit multiple times (in case the inline assembly is duplicated; this is possible because it is nested in a function). compat_symbol does not necessarily have this property because it is intended to provide a symbol definition. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2021-03-25 11:06:58 +01:00
Paul Eggert	82cfac84c7	Update automatically-generated copyright dates These were updated simply by running "make" to regen the files.	2021-01-02 12:17:34 -08:00
Paul Eggert	9fcdec7386	Update copyright dates not handled by scripts/update-copyrights. I've updated copyright dates in glibc for 2021. This is the patch for the changes not generated by scripts/update-copyrights and subsequent build / regeneration of generated files. As well as the usual annual updates, mainly dates in --version output (minus csu/version.c which previously had to be handled manually but is now successfully updated by update-copyrights), there is a small change to the copyright notice in NEWS which should let NEWS get updated automatically next year. Please remember to include 2021 in the dates for any new files added in future (which means updating any existing uncommitted patches you have that add new files to use the new copyright dates in them).	2021-01-02 12:17:34 -08:00
Paul Eggert	2b778ceb40	Update copyright dates with scripts/update-copyrights I used these shell commands: ../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright (cd ../glibc && git commit -am"[this commit message]") and then ignored the output, which consisted lines saying "FOO: warning: copyright statement not found" for each of 6694 files FOO. I then removed trailing white space from benchtests/bench-pthread-locks.c and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this diagnostic from Savannah: remote: * pre-commit check failed ... remote: * error: lines with trailing whitespace found remote: error: hook declined to update refs/heads/master	2021-01-02 12:17:34 -08:00
Dmitry V. Levin	14ef9c185b	treewide: fix incorrect spelling of indices in comments Replace 'indeces' with 'indices', the most annoying of these typos were those found in elf.h which is a public header file copied to other projects.	2020-12-11 02:00:00 +00:00
Joseph Myers	5c3b0374eb	Do not use array parameter to new_composite_name (bug 26726) Among the warnings causing a glibc build with GCC 11 to fail is one for a call new_composite_name in setlocale.c. The newnames argument is declared as an array with __LC_LAST elements, but when the category argument is not LC_ALL, it actually only has one element. Since the number of elements depends on the first argument to the function, it seems clearer to declare the argument as a pointer. Tested with build-many-glibcs.py for arm-linux-gnueabi, where this allows the build to get further. Reviewed-by: DJ Delorie <dj@redhat.com>	2020-10-30 21:39:12 +00:00
Adhemerval Zanella	04986243d1	Remove internal usage of extensible stat functions It replaces the internal usage of __{f,l}xstat{at}{64} with the __{f,l}stat{at}{64}. It should not change the generate code since sys/stat.h explicit defines redirections to internal calls back to xstat* symbols. Checked with a build for all affected ABIs. I also check on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: Lukasz Majewski <lukma@denx.de>	2020-09-11 14:35:32 -03:00
Florian Weimer	981e638d38	locale: Add transliteration for Geresh, Gershayim (U+05F3, U+05F4) ISO-8859-8-based locales will need this transliteration if the locale files contain these Unicode characters. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2020-05-15 10:38:10 +02:00
Carlos O'Donell	6f0baacf0f	locale/tst-localedef-path-norm: Don't create $(complocaledir) We automatically create $(complocaledir) in the testroot.root now and so we don't need to create it in the test. Tested on x86_64. Reviewed-by: DJ Delorie <dj@redhat.com>	2020-04-30 16:28:07 -04:00
Szabolcs Nagy	d96cb37678	Increase the timeout of locale/tst-localedef-path-norm On the current AArch64 buildbot the default 20s timeout is not enough to run this test, it seems make test t=locale/tst-localedef-path-norm takes about 25s, so i increased the timeout to 30s.	2020-04-27 16:20:29 +01:00
Carlos O'Donell	99de869beb	Use 2020 as copyright year. Use the year 2020 for files added by commit: `92954ffa5a`	2020-04-27 10:34:52 -04:00
Carlos O'Donell	92954ffa5a	localedef: Add verbose messages for failure paths. During testing of localedef running in a minimal container there were several error cases which were hard to diagnose since they appeared as strerror (errno) values printed by the higher level functions. This change adds three new verbose messages for potential failure paths. The new messages give the user the opportunity to use -v and display additional information about why localedef might be failing. I found these messages useful myself while writing a localedef container test for --no-hard-links. Since the changes cleanup the code that handle codeset normalization we add tst-localedef-path-norm which contains many sub-tests to verify the correct expected normalization of codeset strings both when installing to default paths (the only time normalization is enabled) and installing to absolute paths. During the refactoring I created at least one buffer-overflow which valgrind caught, but these tests did not catch because the exec in the container had a very clean heap with zero-initialized memory. However, between valgrind and the tests the results are clean. The new tst-localedef-path-norm passes without regression on x86_64. Change-Id: I28b9f680711ff00252a2cb15625b774cc58ecb9d	2020-04-26 13:55:58 -04:00
Joseph Myers	ef02e3c476	Fix locale/tst-locale-locpath cross-testing when sshd sets LANG. The locale/tst-locale-locpath test unsets LANG, then runs a test with test_wrapper_env and expects LANG to remain unset for that test. This does not work for cross-testing with cross-test-ssh.sh when sshd (on the system specified as an argument to cross-test-ssh.sh) is configured to have a default LANG setting. The general design used in cross testing, after commit `8540f6d2a7` ("Don't require test wrappers to preserve environment variables, use more consistent environment.", 6 June 2014), is that environment settings required by tests should be passed explicitly to $(test-wrapper-env). This patch changes tst-locale-locpath.sh to pass an explicit LANG= rather than expecting "unset LANG" to be in effect for the program run under test_wrapper_env. Note that this does slightly change the environment in which the test is run natively (empty LANG instead of unset LANG) but that difference does not appear relevant to what it is trying to test. Tested for Arm that this fixes the failure seen for that test in cross-testing.	2020-01-24 17:23:07 +00:00
Joseph Myers	5f72f9800b	Update copyright dates not handled by scripts/update-copyrights. I've updated copyright dates in glibc for 2020. This is the patch for the changes not generated by scripts/update-copyrights and subsequent build / regeneration of generated files. As well as the usual annual updates, mainly dates in --version output (minus libc.texinfo which previously had to be handled manually but is now successfully updated by update-copyrights), there is a fix to sysdeps/unix/sysv/linux/powerpc/bits/termios-c_lflag.h where a typo in the copyright notice meant it failed to be updated automatically. Please remember to include 2020 in the dates for any new files added in future (which means updating any existing uncommitted patches you have that add new files to use the new copyright dates in them).	2020-01-01 00:21:22 +00:00
Joseph Myers	d614a75396	Update copyright dates with scripts/update-copyrights.	2020-01-01 00:14:33 +00:00
Egor Kobylkin	7fc8c286e3	locale: Greek -> ASCII transliteration table [BZ #12031]	2019-11-26 12:23:09 +01:00
Mike FABIAN	4ecd584908	Add mnw language code [BZ #25139]	2019-11-06 08:15:16 +01:00

1026 Commits