glibc

mirror of https://sourceware.org/git/glibc.git synced 2024-12-26 04:31:03 +00:00

Author	SHA1	Message	Date
Mike FABIAN	86bdd49d93	Bug 24307: Update to Unicode 12.0.0 Unicode 12.0.0 Support: Character encoding, character type info, and transliteration tables are all updated to Unicode 12.0.0, using the generator scripts contributed by Mike FABIAN (Red Hat). Some info about the number of characters added or changed: Total added characters in newly generated CHARMAP: 554 Total added characters in newly generated WIDTH: 106 alpha: Missing 8 characters of old ctype in new ctype (These are combining marks, apparently they were removed from alpha on purpose) alpha: Added 295 characters in new ctype which were not in old ctype combining: Missing 2 characters of old ctype in new ctype (U+1CF2 VEDIC SIGN ARDHAVISARGA and U+1CF3 VEDIC SIGN ROTATED ARDHAVISARGA, these are now "Alphabetic" in Unicode 12.0.0) combining: Added 37 characters in new ctype which were not in old ctype combining_level3: Missing 2 characters of old ctype in new ctype (U+1CF2 VEDIC SIGN ARDHAVISARGA and U+1CF3 VEDIC SIGN ROTATED ARDHAVISARGA, these are now "Alphabetic" in Unicode 12.0.0) combining_level3: Added 26 characters in new ctype which were not in old ctype graph: Added 554 characters in new ctype which were not in old ctype lower: Added 6 characters in new ctype which were not in old ctype print: Added 554 characters in new ctype which were not in old ctype punct: Missing 29 characters of old ctype in new ctype (These characters have all become "Alphabetic" in Unicode 12.0.0. Therefore, they are not in "punct" anymore (see: is_punct() in unicode_utils.py)) punct: Added 296 characters in new ctype which were not in old ctype tolower: Added 7 characters in new ctype which were not in old ctype totitle: Added 7 characters in new ctype which were not in old ctype toupper: Added 7 characters in new ctype which were not in old ctype upper: Added 7 characters in new ctype which were not in old ctype [BZ #24307] * localedata/unicode-gen/Makefile (UNICODE_VERSION): Set to 12.0.0. * localedata/unicode-gen/DerivedCoreProperties.txt: Update to Unicode 12.0.0. * localedata/unicode-gen/EastAsianWidth.txt: Likewise. * localedata/unicode-gen/PropList.txt: Likewise. * localedata/unicode-gen/UnicodeData.txt: Likewise. * localedata/unicode-gen/ctype_compatibility_test_cases.py: U+108D became "Alphabetic" in Unicode 12.0.0. Adapt test case. * localedata/charmaps/UTF-8: Regenerate. * localedata/locales/i18n_ctype: Likewise. * localedata/locales/tr_TR: Likewise. * localedata/locales/translit_circle: Likewise. * localedata/locales/translit_cjk_compat: Likewise. * localedata/locales/translit_combining: Likewise. * localedata/locales/translit_compat: Likewise. * localedata/locales/translit_font: Likewise. * localedata/locales/translit_fraction: Likewise.	2019-03-08 12:20:35 +01:00
Joseph Myers	04277e02d7	Update copyright dates with scripts/update-copyrights. * All files with FSF copyright notices: Update copyright dates using scripts/update-copyrights. * locale/programs/charmap-kw.h: Regenerated. * locale/programs/locfile-kw.h: Likewise.	2019-01-01 00:11:28 +00:00
Mike FABIAN	4beefeeb8e	Put the correct Unicode version number 11.0.0 into the generated files In some places there was still the old Unicode version 10.0.0 in the files. * localedata/charmaps/UTF-8: Use correct Unicode version 11.0.0 in comment. * localedata/locales/i18n_ctype: Use correct Unicode version in comments and headers. * localedata/unicode-gen/utf8_gen.py: Add option to specify Unicode version * localedata/unicode-gen/Makefile: Use option to specify Unicode version for utf8_gen.py	2018-07-10 17:30:31 +02:00
Mike FABIAN	b11643c21c	Bug 23308: Update to Unicode 11.0.0 Unicode 11.0.0 Support: Character encoding, character type info, and transliteration tables are all updated to Unicode 11.0.0, using the generator scripts contributed by Mike FABIAN (Red Hat). Some info about the number of characters added: Total added characters in newly generated CHARMAP: 684 Total added characters in newly generated WIDTH: 119 alpha: Added 380 characters in new ctype which were not in old ctype combining: Added 56 characters in new ctype which were not in old ctype combining_level3: Added 37 characters in new ctype which were not in old ctype graph: Added 684 characters in new ctype which were not in old ctype lower: Added 82 characters in new ctype which were not in old ctype print: Added 684 characters in new ctype which were not in old ctype punct: Added 304 characters in new ctype which were not in old ctype tolower: Added 79 characters in new ctype which were not in old ctype totitle: Added 33 characters in new ctype which were not in old ctype toupper: Added 79 characters in new ctype which were not in old ctype upper: Added 79 characters in new ctype which were not in old ctype No characters were removed. [BZ #23308] * unicode-gen/Makefile (UNICODE_VERSION): Set to 11.0.0. * localedata/unicode-gen/DerivedCoreProperties.txt: Update to Unicode 11.0.0. * localedata/unicode-gen/EastAsianWidth.txt: likewise. * localedata/unicode-gen/PropList.txt: likewise. * localedata/unicode-gen/UnicodeData.txt: likewise. * localedata/charmaps/UTF-8: Regenerate. * localedata/locales/i18n_ctype: likewise. * localedata/locales/tr_TR: likewise. * localedata/locales/translit_circle: likewise. * localedata/locales/translit_cjk_compat: likewise. * localedata/locales/translit_combining: likewise. * localedata/locales/translit_compat: likewise. * localedata/locales/translit_font: likewise. * localedata/locales/translit_fraction: likewise.	2018-07-04 12:03:33 +02:00
Joseph Myers	688903eb3e	Update copyright dates with scripts/update-copyrights. * All files with FSF copyright notices: Update copyright dates using scripts/update-copyrights. * locale/programs/charmap-kw.h: Regenerated. * locale/programs/locfile-kw.h: Likewise.	2018-01-01 00:32:25 +00:00
Rafal Luzynski	1bb3653925	localedata: Once again correct and regenerate i18n_ctype. Following the previous work by Carlos O'Donell the category of LC_CTYPE is correctly set to "i18n:2012" rather than "unicode:2014" and the i18n_ctype file is once again regenerated from scratch to make sure it does not contain any manual additions except the copyright message. Reviewed-by: Carlos O'Donell <carlos@redhat.com> * localedata/unicode-gen/gen_unicode_ctype.py (output_head): category of LC_CTYPE set to "i18n:2012". * localedata/locales/i18n_ctype: Regenerate.	2017-10-31 23:54:47 +01:00
Carlos O'Donell	337ff3c501	localedata: Fix unicode-gen check target. After the transition to generating a distinct file for Unicode ctype information e.g. i18n_ctype, the check target was left with the wrong target name. This patch fixes the check target and regenerates the files with more information than previously used, filling in the LC_IDENTIFICATION data. Tested on x86_64 by regenerating from Unicode source files, and running checks. Tested by subsequently rebuilding all locales. No regressions in testsuite. Signed-off-by: Carlos O'Donell <carlos@redhat.com> Reported-by: Rafal Luzynski <digitalfreak@lingonborough.com>	2017-10-25 09:17:46 -07:00
Carlos O'Donell	8dc8be75d2	localedata: Reorganize Unicode LC_CTYPE inclusion. The commit does the following things: * Move non-transliteration Unicode generated data to i18n_ctype. * Copy the i18n_ctype data into i18n and add transliteration. In the future, any locale which needs Unicode LC_CTYPE data can also just use `copy i18n_ctype` and get the base character classes and maps without transliteration. Tested by compiling all the locales and my prototype C.UTF-8 which uses it. Signed-off-by: Carlos O'Donell <carlos@redhat.com>	2017-10-13 22:29:52 -07:00
Mike FABIAN	2ae5be041d	Improve utf8_gen.py to set the width for characters with Prepended_Concatenation_Mark property to 1 [BZ #22070] * localedata/unicode-gen/utf8_gen.py: Set the width for characters with Prepended_Concatenation_Mark property to 1 * localedata/charmaps/UTF-8: Updated using the improved script.	2017-09-06 12:39:49 +02:00
Mike FABIAN	af83ed5c46	Write all ranges of neighbouring characters with the same width using the range notation in charmaps/UTF-8 Writing ranges of neighbouring characters with the same width like this <U000E0100>...<U000E01EF> 0 in charmaps/UTF-8 is more efficient than writing many single character lines like: <U000E0100> 0 <U000E0101> 0 ... [BZ #21750] * unicode-gen/utf8_gen.py: Write all ranges of neighbouring characters with the same width using the range notation in charmaps/UTF-8.	2017-09-06 12:37:49 +02:00
Thorsten Glaser	267ee5d7ab	Resolve some historically special cases of ambiguous width [BZ #21750] * unicode-gen/utf8_gen.py (U+00AD): Set width to 1. * unicode-gen/utf8_gen.py (U+1160..U+11FF): Set width to 0. * unicode-gen/utf8_gen.py (U+3248..U+324F): Set width to 2. * unicode-gen/utf8_gen.py (U+4DC0..U+4DFF): Likewise.	2017-08-17 11:06:08 +02:00
Thorsten Glaser	41b6f0ce85	Handle more cases of combining characters [BZ #21750] * unicode-gen/utf8_gen.py: Treat category Me and Mn as combining.	2017-08-17 11:06:08 +02:00
Thorsten Glaser	580be3035d	UnicodeData has precedence over EastAsianWidth [BZ #19852] [BZ #21750] * unicode-gen/utf8_gen.py: Process EastAsianWidth lines before UnicodeData lines so the latter have precedence; remove hack to group output by EastAsianWidth ranges.	2017-08-17 11:06:08 +02:00
Mike FABIAN	925fac7793	Bug 21533: Update to Unicode 10.0.0 * Unicode 10.0.0 Support: Character encoding, character type info, and transliteration tables are all updated to Unicode 10.0.0, using generator scripts contributed by Mike FABIAN (Red Hat).	2017-06-22 17:02:55 +02:00
Mike FABIAN	0b38d66a4e	Bug 20313: Update to Unicode 9.0.0 * Unicode 9.0.0 Support: Character encoding, character type info, and transliteration tables are all updated to Unicode 9.0.0, using generator scripts contributed by Mike FABIAN (Red Hat).	2017-02-21 06:30:38 -05:00
Joseph Myers	bfff8b1bec	Update copyright dates with scripts/update-copyrights.	2017-01-01 00:14:16 +00:00
Mike Frysinger	277da2ab88	unicode-gen: include standard comment file header We deployed this header to all the locale files, so make sure we include it in the generated ones too so we don't lose it.	2016-06-11 02:10:52 -04:00
Joseph Myers	f7a9f785e5	Update copyright dates with scripts/update-copyrights.	2016-01-04 16:05:18 +00:00
Joseph Myers	85bafe6f3d	Automate LC_CTYPE generation for tr_TR, update to Unicode 8.0.0 (bug 18491). This patch makes the automation of Unicode LC_CTYPE generation also support generating the modified LC_CTYPE used for Turkish (where case conversions of 'i' and 'I' differ from ASCII conventions), so allowing that to be more readily kept in sync for future Unicode updates. The patch includes the locale update generated by the scripts. Tested for x86_64. [BZ #18491] * unicode-gen/unicode_utils.py (to_upper_turkish): New function. (to_lower_turkish): Likewise. * unicode-gen/gen_unicode_ctype.py (output_tables): Support producing output with Turkish case conversions. (--turkish): New command-line option. * unicode-gen/Makefile (GENERATED): Add tr_TR. (tr_TR): New rule. * locales/tr_TR: Regenerate LC_CTYPE.	2015-12-11 12:45:19 +00:00
Mike FABIAN	23256f5ed8	Update to Unicode 8.0.0. Update __STDC_ISO_10646__ to 201505L for Unicode 8.0.0. Update character encoding, ctype, and transliteration tables. New scripts autogenerate transliteration tables.	2015-12-10 00:33:48 -05:00
Carlos O'Donell	dd8e8e5476	Update transliteration support to Unicode 7.0.0. The transliteration files are now autogenerated from upstream Unicode data.	2015-12-09 22:52:13 -05:00
Alexandre Oliva	7b1ec6a05c	Amendments to Unicode 7 update. for ChangeLog * include/stdc-predef.h (__STDC_ISO_10646__): Update to 201304L, for Unicode 7. for localedata/ChangeLog * unicode-gen/ctype_compatibility.py: Use date ranges in copyright notice. * unicode-gen/ctype_compatibility_test_cases.py: Likewise. * unicode-gen/gen_unicode_ctype.py: Likewise. * unicode-gen/utf8_compatibility.py: Likewise. * unicode-gen/utf8_gen.py: Likewise. Use upper case for global variables, use tuples for global constant arrays. From Mike FABIAN. Suggested by Mike Frysinger <vapier@gentoo.org>.	2015-02-23 11:35:24 -03:00
Alexandre Oliva	4a4839c94a	Unicode 7.0.0 update; added generator scripts. for localedata/ChangeLog [BZ #17588] [BZ #13064] [BZ #14094] [BZ #17998] * unicode-gen/Makefile: New. * unicode-gen/unicode-license.txt: New, from Unicode. * unicode-gen/UnicodeData.txt: New, from Unicode. * unicode-gen/DerivedCoreProperties.txt: New, from Unicode. * unicode-gen/EastAsianWidth.txt: New, from Unicode. * unicode-gen/gen_unicode_ctype.py: New generator, from Mike FABIAN <mfabian@redhat.com>. * unicode-gen/ctype_compatibility.py: New verifier, from Pravin Satpute <psatpute@redhat.com> and Mike FABIAN. * unicode-gen/ctype_compatibility_test_cases.py: New verifier module, from Mike FABIAN. * unicode-gen/utf8_gen.py: New generator, from Pravin Satpute and Mike FABIAN. * unicode-gen/utf8_compatibility.py: New verifier, from Pravin Satpute and Mike FABIAN. * charmaps/UTF-8: Update. * locales/i18n: Update. * gen-unicode-ctype.c: Remove. * tst-ctype-de_DE.ISO-8859-1.in: Adjust, islower now returns true for ordinal indicators.	2015-02-20 20:14:59 -02:00
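
Commit 85bafe6f3d above adds to_upper_turkish and to_lower_turkish so that tr_TR gets case conversions of 'i' and 'I' that differ from ASCII conventions. The following is only a minimal sketch of that idea, not the code from unicode_utils.py: the function bodies are assumptions, and simple_upper and simple_lower are hypothetical stand-ins that use Python's built-in case mapping, whereas the glibc scripts derive their mappings from UnicodeData.txt.

```python
# Sketch of Turkish-aware case conversion on integer code points.
# simple_upper/simple_lower are hypothetical helpers for illustration only.

def simple_upper(code_point: int) -> int:
    """Uppercase via Python's tables; keep the input if the result is not one character."""
    upper = chr(code_point).upper()
    return ord(upper) if len(upper) == 1 else code_point


def simple_lower(code_point: int) -> int:
    """Lowercase via Python's tables; keep the input if the result is not one character."""
    lower = chr(code_point).lower()
    return ord(lower) if len(lower) == 1 else code_point


def to_upper_turkish(code_point: int) -> int:
    """Turkish uppercase: dotted 'i' maps to U+0130 (capital I with dot above)."""
    if code_point == 0x0069:  # LATIN SMALL LETTER I
        return 0x0130
    return simple_upper(code_point)


def to_lower_turkish(code_point: int) -> int:
    """Turkish lowercase: 'I' maps to U+0131 (dotless i), U+0130 maps to 'i'."""
    if code_point == 0x0049:  # LATIN CAPITAL LETTER I
        return 0x0131
    if code_point == 0x0130:  # LATIN CAPITAL LETTER I WITH DOT ABOVE
        return 0x0069
    return simple_lower(code_point)
```

All other code points fall through to the ordinary mapping, so only the dotted and dotless i pairs behave differently from the default tables.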

23 Commits
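
Commit af83ed5c46 in the log describes writing neighbouring characters that share a width as a single range line such as <U000E0100>...<U000E01EF> 0 instead of one line per character. The sketch below illustrates that run-collapsing idea under stated assumptions: format_width_ranges, format_run and ucs_symbol are hypothetical names chosen for this example, the input is assumed to be a dict from code point to width, and the single-space separator follows the example in the commit message. It is not the implementation in utf8_gen.py.

```python
from typing import Dict, List


def ucs_symbol(code_point: int) -> str:
    """Format a code point as <UXXXX>, or <UXXXXXXXX> beyond the BMP (assumed convention)."""
    if code_point < 0x10000:
        return '<U{:04X}>'.format(code_point)
    return '<U{:08X}>'.format(code_point)


def format_run(start: int, end: int, width: int) -> str:
    """Emit one line for a run; a run of length one needs no range notation."""
    if start == end:
        return '{} {}'.format(ucs_symbol(start), width)
    return '{}...{} {}'.format(ucs_symbol(start), ucs_symbol(end), width)


def format_width_ranges(widths: Dict[int, int]) -> List[str]:
    """Collapse consecutive code points with identical width into range lines."""
    lines: List[str] = []
    run_start = run_end = run_width = None
    for code_point in sorted(widths):
        width = widths[code_point]
        if (run_start is not None
                and code_point == run_end + 1
                and width == run_width):
            run_end = code_point  # extend the current run
            continue
        if run_start is not None:
            lines.append(format_run(run_start, run_end, run_width))
        run_start = run_end = code_point  # start a new run
        run_width = width
    if run_start is not None:
        lines.append(format_run(run_start, run_end, run_width))
    return lines


if __name__ == '__main__':
    # The 240 variation selectors from the commit message collapse to one line.
    selectors = {cp: 0 for cp in range(0xE0100, 0xE01F0)}
    for line in format_width_ranges(selectors):
        print(line)
```

A run is broken by either a gap in the code points or a change in width, so isolated characters still get their own single-character line.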