919 Commits

Author SHA1 Message Date
Carlos O'Donell
7cd7d36f1f Keep expected behaviour for [a-z] and [A-Z] (Bug 23393).
In commit 9479b6d5e0 we updated all of
the collation data to harmonize with the new version of ISO 14651,
which is derived from Unicode 9.0.0.  This collation update brought
with it some locale changes that some users found undesirable.  In
particular, it altered the meaning of the locale-dependent range
expressions [a-z] and [A-Z]: for en_US, uppercase letters were
matched by [a-z] for the first time.  Users of other locales that
already have this property are familiar with [a-z] matching uppercase
letters, but the change could cause significant problems for users of
en_US and similar locales that never behaved this way before.
Whether this behaviour is desirable is contentious.  GNU Awk has this
to say on the topic:
https://www.gnu.org/software/gawk/manual/html_node/Ranges-and-Locales.html
The POSIX rationale, under "RE Bracket Expression", adds:
http://pubs.opengroup.org/onlinepubs/9699919799/xrat/V4_xbd_chap09.html
"The current standard leaves unspecified the behavior of a range
expression outside the POSIX locale. ... As noted above, efforts were
made to resolve the differences, but no solution has been found that
would be specific enough to allow for portable software while not
invalidating existing implementations."
In glibc we implement the requirement of ISO POSIX-2:1993 and use
collation element order (CEO) to construct the range expression; the
internal API is __collseq_table_lookup().  Because we use CEO and
also have 4-level weights on each collation rule, we can in practice
reorder the collation rules in iso14651_t1_common (the new data) to
provide consistent range expression resolution *and* the weights
should still maintain the expected total order.  Therefore this
patch does three things:

* Reorder the collation rules for the LATIN script in
  iso14651_t1_common to deinterlace uppercase and lowercase letters in
  the collation element orders.

* Add new test data en_US.UTF-8.in for sort-test.sh which exercises
  strcoll* and strxfrm* and ensures the ISO 14651 collation remains.

* Add back tests to tst-fnmatch.input and tst-regexloc.c which
  verify that [a-z] does not match A or Z.

The reordering of the ISO 14651 data is done in an entirely mechanical
fashion using the following program attached to the bug:
https://sourceware.org/bugzilla/show_bug.cgi?id=23393#c28

It is up for discussion whether the iso14651_t1_common data should be
refined further to have 3 very tight collation element ranges that
include only a-z, A-Z, and 0-9, which would implement the solution
sought after in:
https://sourceware.org/bugzilla/show_bug.cgi?id=23393#c12
and implemented here:
https://www.sourceware.org/ml/libc-alpha/2018-07/msg00854.html

No regressions on x86_64.
Verified that removal of the iso14651_t1_common change causes tst-fnmatch
to regress with:
422: fnmatch ("[a-z]", "A", 0) = 0 (FAIL, expected FNM_NOMATCH) ***
...
425: fnmatch ("[A-Z]", "z", 0) = 0 (FAIL, expected FNM_NOMATCH) ***
2018-07-25 17:00:45 -04:00
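The effect of deinterlacing on range expressions can be sketched with a toy model. This is not glibc's implementation: list positions stand in for collation element order, and both orderings below (lowercase first, then uppercase) are assumptions made for illustration only.

```python
import string

lower, upper = string.ascii_lowercase, string.ascii_uppercase

# Interlaced order: a A b B ... z Z (assumed to resemble the new ISO 14651 data).
interlaced = [c for pair in zip(lower, upper) for c in pair]
# Deinterlaced order: a b ... z A B ... Z (assumed result of the reordering).
deinterlaced = list(lower) + list(upper)


def in_range(order, first, last, ch):
    """Return True if ch lies between first and last (inclusive) in order."""
    pos = {c: i for i, c in enumerate(order)}
    return pos[first] <= pos[ch] <= pos[last]


print(in_range(interlaced, "a", "z", "A"))    # True: [a-z] matches A
print(in_range(deinterlaced, "a", "z", "A"))  # False: [a-z] no longer matches A
print(in_range(interlaced, "A", "Z", "z"))    # True: [A-Z] matches z
print(in_range(deinterlaced, "A", "Z", "z"))  # False
```

In the interlaced order every uppercase letter except Z falls between a and z, which mirrors the tst-fnmatch failures quoted in the commit message.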
Quentin PAGÈS
df467d229a oc_FR locale: Multiple updates (bug 23140, bug 23422).
Multiple updates for Occitan language including alternative month names,
update abday and abmon, fix typos in day, fix d_fmt, correct LC_NAME,
and use “copy "ca_ES"” as LC_COLLATE.

	[BZ #23140]
	* localedata/locales/oc_FR (mon): Rename to...
	(alt_mon): This, then update October (typo fix).
	(mon): New content (genitive case, month names preceded by
	"de" or "d’").

	[BZ #23422]
	* localedata/locales/oc_FR (abday): Update all items.
	(day): Update Wednesday and Saturday (typo fixes).
	(abmon): Update all items, except May.
	(d_fmt): Update "%d.%m.%Y" -> "%d/%m/%Y".
	(LC_IDENTIFICATION): Bump the revision number and date.
	Keep the "category" entries in alphabetic order.
	(LC_ADDRESS): Remove no longer needed comment.
	(LC_COLLATE): Use “copy "ca_ES"”.
	(LC_NAME): Set the correct values of "name_fmt", "name_mr", and
	"name_mrs".

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2018-07-18 23:17:17 +02:00
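The d_fmt change in this commit can be previewed with a small sketch. It uses Python's datetime.strftime rather than the locale machinery itself, and the sample date is arbitrary; the two format strings are the ones quoted in the ChangeLog entry.

```python
from datetime import date

sample = date(2018, 7, 18)  # arbitrary sample date

old_d_fmt = "%d.%m.%Y"  # value before the commit
new_d_fmt = "%d/%m/%Y"  # value after the commit

print(sample.strftime(old_d_fmt))  # 18.07.2018
print(sample.strftime(new_d_fmt))  # 18/07/2018
```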
Valery Timiriliyev
61c4aad705 New locale: Yakut (Sakha) for Russia (sah_RU) [BZ #22241]
* localedata/Makefile (test-input): Add sah_RU.UTF-8.
	(LOCALES): Likewise.
	* localedata/SUPPORTED (sah_RU/UTF-8): New entry.
	* localedata/locales/sah_RU: New file.
	* localedata/sah_RU.UTF-8.in: New file.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2018-07-18 11:45:44 +02:00
Rafal Luzynski
9145f0333d os_RU: Add alternative month names (bug 23140).
[BZ #23140]
	* localedata/locales/os_RU (mon): Rename to...
	(alt_mon): This.
	(mon): Import from CLDR (genitive case).

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2018-07-17 23:58:56 +02:00
Rafal Luzynski
0a83bad2aa dsb_DE locale: Fix syntax error and add tests (bug 23208).
Fix a syntax error in the collation rules of the Lower Sorbian language.
Add a collation test to catch bugs like this early.

Reported-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>

	[BZ #23208]
	* localedata/Makefile (test-input): Add dsb_DE.UTF-8.
	(LOCALES): Likewise.
	* localedata/dsb_DE.UTF-8.in: New file.
	* localedata/locales/dsb_DE (LC_COLLATE): Fix syntax error.
2018-07-13 23:06:32 +02:00
Mike FABIAN
4beefeeb8e Put the correct Unicode version number 11.0.0 into the generated files
In some places there was still the old Unicode version 10.0.0 in the files.

	* localedata/charmaps/UTF-8: Use correct Unicode version 11.0.0 in comment.
	* localedata/locales/i18n_ctype: Use correct Unicode version in comments
	and headers.
	* localedata/unicode-gen/utf8_gen.py: Add option to specify Unicode version.
	* localedata/unicode-gen/Makefile: Use option to specify Unicode version
	for utf8_gen.py.
2018-07-10 17:30:31 +02:00
Mike FABIAN
b11643c21c Bug 23308: Update to Unicode 11.0.0
Unicode 11.0.0 Support: Character encoding, character type info, and
transliteration tables are all updated to Unicode 11.0.0, using
the generator scripts contributed by Mike FABIAN (Red Hat).

Some info about the number of characters added:

Total added characters in newly generated CHARMAP: 684
Total added characters in newly generated WIDTH: 119
alpha: Added 380 characters in new ctype which were not in old ctype
combining: Added 56 characters in new ctype which were not in old ctype
combining_level3: Added 37 characters in new ctype which were not in old ctype
graph: Added 684 characters in new ctype which were not in old ctype
lower: Added 82 characters in new ctype which were not in old ctype
print: Added 684 characters in new ctype which were not in old ctype
punct: Added 304 characters in new ctype which were not in old ctype
tolower: Added 79 characters in new ctype which were not in old ctype
totitle: Added 33 characters in new ctype which were not in old ctype
toupper: Added 79 characters in new ctype which were not in old ctype
upper: Added 79 characters in new ctype which were not in old ctype

No characters were removed.

	[BZ #23308]
	* unicode-gen/Makefile (UNICODE_VERSION): Set to 11.0.0.
	* localedata/unicode-gen/DerivedCoreProperties.txt: Update to Unicode 11.0.0.
	* localedata/unicode-gen/EastAsianWidth.txt: likewise.
	* localedata/unicode-gen/PropList.txt: likewise.
	* localedata/unicode-gen/UnicodeData.txt: likewise.
	* localedata/charmaps/UTF-8: Regenerate.
	* localedata/locales/i18n_ctype: likewise.
	* localedata/locales/tr_TR: likewise.
	* localedata/locales/translit_circle: likewise.
	* localedata/locales/translit_cjk_compat: likewise.
	* localedata/locales/translit_combining: likewise.
	* localedata/locales/translit_compat: likewise.
	* localedata/locales/translit_font: likewise.
	* localedata/locales/translit_fraction: likewise.
2018-07-04 12:03:33 +02:00
Michael Wolf
a1e0c5fa88 New locale: Lower Sorbian (dsb_DE) [BZ #23208]
[BZ #23208]
	* localedata/SUPPORTED (dsb_DE/UTF-8): New entry.
	* localedata/locales/dsb_DE: New file.
2018-06-29 23:03:06 +02:00
Rafal Luzynski
2e0c5de622 hy_AM: Add alternative month names (bug 23140).
This locale already contained correct data in the mon array; it is now
updated from CLDR so that the month names start with lowercase letters.

alt_mon is a new import from CLDR.  The change was reviewed off-list
by a native speaker.

	[BZ #23140]
	* localedata/locales/hy_AM (mon): Synchronize with CLDR (lowercase,
	genitive case).
	(alt_mon): New entry, import from CLDR (nominative case).
2018-06-29 22:18:24 +02:00
Sylvain Lesage
cdb52c7182 es_BO locale: Change LC_PAPER to en_US (bug 22996).
[BZ #22996]
	* localedata/locales/es_BO (LC_PAPER): Change to “copy "en_US"”.
2018-06-29 21:45:16 +02:00
Rafal Luzynski
339124ab42 ast_ES: Add alternative month names (bug 23140).
[BZ #23140]
	* localedata/locales/ast_ES (mon): Rename to...
	(alt_mon): This.
	(mon): Import from CLDR (genitive case).
2018-06-29 12:36:53 +02:00
Rafal Luzynski
189699ab37 csb_PL: Add alternative month names (bug 23140).
The Kashubian language is not supported by CLDR; the data was copied from
Wikipedia and from documents released by RJK (the official Kashubian
Language Council), and was also checked with a native speaker.

Note that this language also needs the ab_alt_mon feature because of the
month May: nominative "môj", genitive "maja"; abbreviated nominative "môj",
abbreviated genitive "maj".

	[BZ #23140]
	* localedata/locales/csb_PL (mon): Rename to...
	(alt_mon): This.
	(abmon): Rename to...
	(ab_alt_mon): This.
	(mon): Add with proper genitive forms, copy from Wikipedia.
	(abmon): Likewise.
2018-06-25 12:34:31 +02:00
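How the four month arrays relate to format specifiers can be sketched with a toy lookup. It uses only the May forms quoted in the commit message and assumes the usual glibc mapping (%B to mon, %OB to alt_mon, %b to abmon, %Ob to ab_alt_mon); the render helper is hypothetical and is not strftime.

```python
# Kashubian forms for May, as quoted in the commit message.
MAY = {
    "%OB": "môj",   # alt_mon: full nominative
    "%Ob": "môj",   # ab_alt_mon: abbreviated nominative
    "%B": "maja",   # mon: full genitive
    "%b": "maj",    # abmon: abbreviated genitive
}


def render(fmt, day):
    """Hypothetical mini-formatter: expands %d and the four month specifiers."""
    out = fmt.replace("%d", str(day))
    for spec, name in MAY.items():
        out = out.replace(spec, name)
    return out


print(render("%d %B", 3))  # 3 maja  (genitive next to a day number)
print(render("%OB", 3))    # môj     (nominative, standalone)
print(render("%d %b", 3))  # 3 maj
print(render("%Ob", 3))    # môj
```

The abbreviated forms differ ("maj" versus "môj"), which is why abmon alone cannot serve both contexts here.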
Rafal Luzynski
0ea3f13cce csb_PL: Update month translations + add yesstr/nostr (bug 19485).
Thank you Michal Ostrowski for the feedback.

	[BZ #19485]
	* localedata/locales/csb_PL (mon): Fix typos:
	"łżëkwiôt" -> "łżëkwiat" (April); "lëpinc" -> "lëpińc" (July).
	(yesstr): Add, value is "jo".
	(nostr): Add, value is "nié".
2018-06-25 12:32:51 +02:00
Rafal Luzynski
c4ad5782c4 gd_GB, hsb_DE, wa_BE: Add alternative month names (bug 23140).
As a follow-up to the fix for bug 10871, these three languages now support
two grammatical cases of the month names.

This commit does not resolve the bug because more languages remain
to be committed.

	[BZ #23140]
	* localedata/locales/gd_GB (mon): Rename to...
	(alt_mon): This.
	(mon): Import from CLDR (genitive case).
	* localedata/locales/hsb_DE (mon): Rename to...
	(alt_mon): This.
	(mon): Import from CLDR (genitive case).
	* localedata/locales/wa_BE (mon): Rename to...
	(alt_mon): This.
	(mon): Add, fill with the proper genitive forms, but CLDR data
	is incomplete; completed according to the comments in this file.
	(d_t_fmt): Do not use "di" before the month name, no longer needed.

	* localedata/locales/wa_BE (country_name): Reword
	"Beljike" -> "Beldjike".
2018-06-12 01:33:55 +02:00
Rafal Luzynski
bb066cb806 gd_GB: Fix typo in abbreviated "May" (bug 23152).
[BZ #23152]
	* localedata/locales/gd_GB (abmon): Fix typo in May:
	"Mhàrt" -> "Cèit".  Adjust the comment according to the change.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2018-05-11 00:00:10 +02:00
Dragan Stanojevic - Nevidljivi
ea76691a75 hr_HR locale: fix thousands_sep and mon_thousands_sep
[BZ #23094]
	* localedata/locales/hr_HR: fix thousands_sep and
	mon_thousands_sep
2018-04-23 17:00:26 +02:00
Rafal Luzynski
807fee29d2 cs_CZ locale: Add alternative month names (bug 22963).
Add alternative month names, primary month names are genitive now.

	[BZ #22963]
	* localedata/locales/cs_CZ (mon): Rename to...
	(alt_mon): This.
	(mon): Import from CLDR (genitive case).
2018-03-15 01:21:02 +01:00
Rafal Luzynski
e7155a28ef Greek (el_CY, el_GR) locales: Introduce ab_alt_mon (bug 22937).
As spotted by the GNOME translation team, the Greek language has a
visible difference between the abbreviated nominative and the abbreviated
genitive case for some month names.  Examples:

May:

abbreviated nominative: "Μάι" -> abbreviated genitive: "Μαΐ"

July:

abbreviated nominative: "Ιούλ" -> abbreviated genitive: "Ιουλ"

and more month names with similar differences.

Original discussion: https://bugzilla.gnome.org/show_bug.cgi?id=793645#c21

	[BZ #22937]
	* localedata/locales/el_CY (abmon): Rename to...
	(ab_alt_mon): This.
	(abmon): Import from CLDR (abbreviated genitive case).
	* localedata/locales/el_GR (abmon): Rename to...
	(ab_alt_mon): This.
	(abmon): Import from CLDR (abbreviated genitive case).
2018-03-15 01:11:05 +01:00
Rafal Luzynski
71d7b12168 lt_LT locale: Update abbreviated month names (bug 22932).
A GNOME translator asked to use the same abbreviated month names
as provided by CLDR.  This sounds reasonable.  See the discussion:
https://bugzilla.gnome.org/show_bug.cgi?id=793645#c27

	[BZ #22932]
	* localedata/locales/lt_LT (abmon): Synchronize with CLDR.
2018-03-15 01:08:49 +01:00
Robert Buj
a00bffe8b5 ca_ES locale: Update LC_TIME (bug 22848).
Add/fix alternative month names, long & short formats, am_pm,
abday settings, and improve indentation for Catalan.

	[BZ #22848]
	* localedata/locales/ca_ES (abmon): Rename to...
	(ab_alt_mon): This, then synchronize with CLDR (nominative case).
	(mon): Rename to...
	(alt_mon): This.
	(abmon): Import from CLDR (genitive case, month names preceded by
	"de" or "d’").
	(mon): Likewise.
	(abday): Synchronize with CLDR.
	(d_t_fmt): Likewise.
	(d_fmt): Likewise.
	(am_pm): Likewise.

	(LC_TIME): Improve indentation.
	(LC_TELEPHONE): Likewise.
	(LC_NAME): Likewise.
	(LC_ADDRESS): Likewise.
2018-03-15 01:05:19 +01:00
Mike FABIAN
a527f09cd1 an_ES locale: update some locale data [BZ #22896]
[BZ #22896]
	* localedata/locales/an_ES: update month and day names,
	improve d_fmt, improve postal_fmt, add country_post,
	add country_isbn
2018-03-01 15:06:24 +01:00
Mike FABIAN
35d660b01e bg_BG locale: Fix a typo in a comment
* localedata/locales/bg_BG (LC_COLLATE): The comment mentioned
	Ukrainian instead of Bulgarian.
2018-03-01 14:52:26 +01:00
Mike FABIAN
1597385481 Adapt collation in several locales to the new iso14651_t1_common file
[BZ #22550] - es_ES locale (and other es_* locales): collation should
treat ñ as a primary different character, sync the collation
for Spanish with CLDR
[BZ #21547] - Tibetan script collation broken (Dzongkha and Tibetan)

	* localedata/Makefile: Add new test files.
	* localedata/lv_LV.UTF-8.in: Adapt test file to new collation order.
	* localedata/sv_SE.ISO-8859-1.in: Adapt test file to new collation order.
	* localedata/uk_UA.UTF-8.in: Adapt test file to new collation order.
	* localedata/am_ET.UTF-8.in: New test file.
	* localedata/az_AZ.UTF-8.in: Likewise.
	* localedata/be_BY.UTF-8.in: Likewise.
	* localedata/ber_DZ.UTF-8.in: Likewise.
	* localedata/ber_MA.UTF-8.in: Likewise.
	* localedata/bg_BG.UTF-8.in: Likewise.
	* localedata/br_FR.UTF-8.in: Likewise.
	* localedata/cmn_TW.UTF-8.in: Likewise.
	* localedata/crh_UA.UTF-8.in: Likewise.
	* localedata/csb_PL.UTF-8.in: Likewise.
	* localedata/cv_RU.UTF-8.in: Likewise.
	* localedata/cy_GB.UTF-8.in: Likewise.
	* localedata/dz_BT.UTF-8.in: Likewise.
	* localedata/eo.UTF-8.in: Likewise.
	* localedata/es_ES.UTF-8.in: Likewise.
	* localedata/fa_IR.UTF-8.in: Likewise.
	* localedata/fi_FI.UTF-8.in: Likewise.
	* localedata/fil_PH.UTF-8.in: Likewise.
	* localedata/fur_IT.UTF-8.in: Likewise.
	* localedata/gez_ER.UTF-8@abegede.in: Likewise.
	* localedata/ha_NG.UTF-8.in: Likewise.
	* localedata/ig_NG.UTF-8.in: Likewise.
	* localedata/ik_CA.UTF-8.in: Likewise.
	* localedata/kk_KZ.UTF-8.in: Likewise.
	* localedata/ku_TR.UTF-8.in: Likewise.
	* localedata/ky_KG.UTF-8.in: Likewise.
	* localedata/ln_CD.UTF-8.in: Likewise.
	* localedata/mi_NZ.UTF-8.in: Likewise.
	* localedata/ml_IN.UTF-8.in: Likewise.
	* localedata/mn_MN.UTF-8.in: Likewise.
	* localedata/mr_IN.UTF-8.in: Likewise.
	* localedata/mt_MT.UTF-8.in: Likewise.
	* localedata/nb_NO.UTF-8.in: Likewise.
	* localedata/om_KE.UTF-8.in: Likewise.
	* localedata/os_RU.UTF-8.in: Likewise.
	* localedata/ps_AF.UTF-8.in: Likewise.
	* localedata/ro_RO.UTF-8.in: Likewise.
	* localedata/ru_RU.UTF-8.in: Likewise.
	* localedata/sc_IT.UTF-8.in: Likewise.
	* localedata/se_NO.UTF-8.in: Likewise.
	* localedata/sq_AL.UTF-8.in: Likewise.
	* localedata/sv_SE.UTF-8.in: Likewise.
	* localedata/szl_PL.UTF-8.in: Likewise.
	* localedata/tg_TJ.UTF-8.in: Likewise.
	* localedata/tk_TM.UTF-8.in: Likewise.
	* localedata/tt_RU.UTF-8.in: Likewise.
	* localedata/tt_RU.UTF-8@iqtelif.in: Likewise.
	* localedata/ug_CN.UTF-8.in: Likewise.
	* localedata/uz_UZ.UTF-8.in: Likewise.
	* localedata/vi_VN.UTF-8.in: Likewise.
	* localedata/yi_US.UTF-8.in: Likewise.
	* localedata/yo_NG.UTF-8.in: Likewise.
	* localedata/zh_CN.UTF-8.in: Likewise.
	* localedata/locales/am_ET: Adapt collation rules to new iso14651_t1_common
	file and fix bugs in the collation.
	* localedata/locales/az_AZ: Likewise.
	* localedata/locales/be_BY: Likewise.
	* localedata/locales/ber_DZ: Likewise.
	* localedata/locales/ber_MA: Likewise.
	* localedata/locales/bg_BG: Likewise.
	* localedata/locales/br_FR: Likewise.
	* localedata/locales/br_FR@euro: Likewise.
	* localedata/locales/ca_ES: Likewise.
	* localedata/locales/cns11643_stroke: Likewise.
	* localedata/locales/crh_UA: Likewise.
	* localedata/locales/cs_CZ: Likewise.
	* localedata/locales/csb_PL: Likewise.
	* localedata/locales/cv_RU: Likewise.
	* localedata/locales/cy_GB: Likewise.
	* localedata/locales/da_DK: Likewise.
	* localedata/locales/dz_BT: Likewise.
	* localedata/locales/en_CA: Likewise.
	* localedata/locales/eo: Likewise.
	* localedata/locales/es_CU: Likewise.
	* localedata/locales/es_EC: Likewise.
	* localedata/locales/es_ES: Likewise.
	* localedata/locales/es_US: Likewise.
	* localedata/locales/et_EE: Likewise.
	* localedata/locales/fa_IR: Likewise.
	* localedata/locales/fi_FI: Likewise.
	* localedata/locales/fil_PH: Likewise.
	* localedata/locales/fur_IT: Likewise.
	* localedata/locales/gez_ER@abegede: Likewise.
	* localedata/locales/ha_NG: Likewise.
	* localedata/locales/hr_HR: Likewise.
	* localedata/locales/hsb_DE: Likewise.
	* localedata/locales/hu_HU: Likewise.
	* localedata/locales/ig_NG: Likewise.
	* localedata/locales/ik_CA: Likewise.
	* localedata/locales/is_IS: Likewise.
	* localedata/locales/iso14651_t1_pinyin: Likewise.
	* localedata/locales/kk_KZ: Likewise.
	* localedata/locales/ku_TR: Likewise.
	* localedata/locales/ky_KG: Likewise.
	* localedata/locales/ln_CD: Likewise.
	* localedata/locales/lt_LT: Likewise.
	* localedata/locales/lv_LV: Likewise.
	* localedata/locales/mi_NZ: Likewise.
	* localedata/locales/ml_IN: Likewise.
	* localedata/locales/mn_MN: Likewise.
	* localedata/locales/mr_IN: Likewise.
	* localedata/locales/mt_MT: Likewise.
	* localedata/locales/nb_NO: Likewise.
	* localedata/locales/om_KE: Likewise.
	* localedata/locales/os_RU: Likewise.
	* localedata/locales/pl_PL: Likewise.
	* localedata/locales/ps_AF: Likewise.
	* localedata/locales/ro_RO: Likewise.
	* localedata/locales/ru_RU: Likewise.
	* localedata/locales/ru_UA: Likewise.
	* localedata/locales/sc_IT: Likewise.
	* localedata/locales/se_NO: Likewise.
	* localedata/locales/si_LK: Likewise.
	* localedata/locales/sq_AL: Likewise.
	* localedata/locales/sv_FI: Likewise.
	* localedata/locales/sv_FI@euro: Likewise.
	* localedata/locales/sv_SE: Likewise.
	* localedata/locales/szl_PL: Likewise.
	* localedata/locales/tg_TJ: Likewise.
	* localedata/locales/ti_ER: Likewise.
	* localedata/locales/tk_TM: Likewise.
	* localedata/locales/tl_PH: Likewise.
	* localedata/locales/tr_TR: Likewise.
	* localedata/locales/tt_RU: Likewise.
	* localedata/locales/tt_RU@iqtelif: Likewise.
	* localedata/locales/ug_CN: Likewise.
	* localedata/locales/uk_UA: Likewise.
	* localedata/locales/uz_UZ: Likewise.
	* localedata/locales/uz_UZ@cyrillic: Likewise.
	* localedata/locales/vi_VN: Likewise.
	* localedata/locales/yi_US: Likewise.
	* localedata/locales/yo_NG: Likewise.
2018-02-27 17:47:50 +01:00
Mike FABIAN
df74ef786f Add sections for various scripts to the iso14651_t1_common file
* localedata/locales/iso14651_t1_common: Add sections for various
	scripts to the iso14651_t1_common file.
2018-02-27 16:52:54 +01:00
Mike FABIAN
d5adfbadd4 iso14651_t1_common: make the fourth level the codepoint for characters which are ignorable on all 4 levels
Entries for characters which have “IGNORE” on all 4 levels like:

 <U0001> IGNORE;IGNORE;IGNORE;IGNORE % START OF HEADING (in ISO 6429)

are changed into:

 <U0001> IGNORE;IGNORE;IGNORE;<U0001> % START OF HEADING (in ISO 6429)

i.e. putting the code point of the character into the fourth level
instead of “IGNORE”. Without that change, all such characters
would compare equal which would make a wcscoll test case fail.
It is better to have a clearly defined sort order even for characters
like this so it is good to use the code point as a tie-break.

	* localedata/locales/iso14651_t1_common: Use the code point of a
	character in the fourth collation level instead of IGNORE for all
	entries which have IGNORE on all 4 levels.
2018-02-27 16:50:30 +01:00
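The rewrite described in this commit can be sketched as a line-by-line substitution. This is an illustrative reconstruction, not the script actually used; the function name and regular expression are assumptions.

```python
import re

# Matches an entry whose four collation levels are all IGNORE.
ALL_IGNORE = re.compile(r"^(\s*)(<U[0-9A-F]+>) IGNORE;IGNORE;IGNORE;IGNORE\b")


def use_codepoint_tiebreak(line):
    """Put the character's own code point into the fourth level."""
    return ALL_IGNORE.sub(r"\1\2 IGNORE;IGNORE;IGNORE;\2", line)


before = "<U0001> IGNORE;IGNORE;IGNORE;IGNORE % START OF HEADING (in ISO 6429)"
print(use_codepoint_tiebreak(before))
# <U0001> IGNORE;IGNORE;IGNORE;<U0001> % START OF HEADING (in ISO 6429)
```

Entries that already carry a weight on any level do not match the pattern and pass through unchanged.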
Mike FABIAN
5f5a961091 Add convenience symbols like <AFTER-A>, <BEFORE-A> to iso14651_t1_common
* localedata/locales/iso14651_t1_common: Add some convenient collation
	symbols like <AFTER-A>, <BEFORE-A> to make tailoring easier using
	rules similar to those in CLDR.
2018-02-27 16:47:22 +01:00
Mike FABIAN
8a97e9002f Fixing syntax errors after updating the iso14651_t1_common file
* localedata/locales/iso14651_t1_common: The new version of this
	file downloaded from ISO contained several syntax errors which
	are fixed by this patch.
2018-02-27 16:45:30 +01:00
Mike FABIAN
bbdd2fba7d iso14651_t1_common: <U\([0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F]\)> → <U000\1>
* localedata/locales/iso14651_t1_common: Replace all <U.....>
	with <U000.....> because glibc understands only 4-digit or 8-digit
	code point notation.
2018-02-27 16:44:03 +01:00
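The substitution named in the commit subject can be expressed in Python roughly as follows. This is a sketch of the same idea, not the command actually run; the helper name is an assumption.

```python
import re

# Exactly five hex digits between "<U" and ">".
FIVE_DIGIT = re.compile(r"<U([0-9A-F]{5})>")


def pad_to_eight_digits(line):
    """Rewrite <Uxxxxx> as <U000xxxxx>; 4- and 8-digit symbols are untouched."""
    return FIVE_DIGIT.sub(r"<U000\1>", line)


print(pad_to_eight_digits("<U1D400>"))     # <U0001D400>
print(pad_to_eight_digits("<U0041>"))      # <U0041>
print(pad_to_eight_digits("<U0001D400>"))  # <U0001D400>
```

Because the pattern requires exactly five digits, applying the function a second time changes nothing.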
Mike FABIAN
1569e551af Necessary changes after updating the iso14651_t1_common file
* localedata/locales/iso14651_t1_common: Necessary changes
	to make the file downloaded from ISO usable by glibc.
2018-02-27 16:42:14 +01:00
Mike FABIAN
9479b6d5e0 Update iso14651_t1_common file to ISO14651_2016_TABLE1_en.txt [BZ #14095]
[BZ #14095] - Review / update collation data from Unicode / ISO 14651

File downloaded from:
http://standards.iso.org/iso-iec/14651/ed-4/ISO14651_2016_TABLE1_en.txt

Updating this file alone is not enough: there are problems in the new
file which need to be fixed, and the collation rules for many locales
need to be adapted. This is done by the following patches.

This update also fixes the problem that many characters are treated as
identical when sorting because they were not yet in the old
iso14651_t1_common file, see:

https://bugzilla.redhat.com/show_bug.cgi?id=1336308
- Infinite (∞) and empty set (∅) are treated as if they were the same character by sort and uniq

	[BZ #14095]
	* localedata/locales/iso14651_t1_common: Update file to
	latest version from ISO (ISO14651_2016_TABLE1_en.txt).
2018-02-27 16:36:31 +01:00
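Why characters missing from the collation table end up "identical" can be shown with a toy sort key. The tables and weights below are invented for illustration and do not reflect the real iso14651_t1_common data; the model simply assumes that a character without an entry contributes no weight.

```python
def sort_key(text, table):
    """Toy collation key: characters absent from the table add no weight."""
    return tuple(table[ch] for ch in text if ch in table)


# Hypothetical weights; only the presence or absence of entries matters.
old_table = {"a": 1, "b": 2}
new_table = {"a": 1, "b": 2, "∞": 3, "∅": 4}

print(sort_key("∞", old_table) == sort_key("∅", old_table))  # True: compare equal
print(sort_key("∞", new_table) == sort_key("∅", new_table))  # False: now distinct
```

A tool that deduplicates by collation key would merge the two lines under the old table and keep them apart under the new one.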
Mike FABIAN
9d5cfd8e83 Use / instead of - in d_fmt for pt_BR and pt_PT [BZ #17438]
[BZ #17438]
	* localedata/locales/pt_BR (LC_TIME): use / instead of -
	in d_fmt.
	* localedata/locales/pt_PT (LC_TIME): likewise
2018-02-23 09:50:29 +01:00
Mike FABIAN
6c7269f31d Use “copy "es_BO"” in LC_TIME of es_CU, es_CL, and es_EC
LC_TIME in these 4 locales is identical, using “copy "es_BO"” makes
that more obvious.

	[BZ #22646]
	* localedata/locales/es_CL (LC_TIME): copy "es_BO".
	* localedata/locales/es_CU (LC_TIME): copy "es_BO".
	* localedata/locales/es_EC (LC_TIME): copy "es_BO".
2018-02-23 09:49:03 +01:00
Mike FABIAN
7ec5f9465e Add missing “reorder-end” in LC_COLLATE of et_EE [BZ #22517]
[BZ #22517]
	* localedata/locales/et_EE (LC_COLLATE): add missing “reorder-end”
2018-02-21 16:36:39 +01:00
Rafal Luzynski
9a1b267d47 hr_HR: Add alternative month names (bug 10871).
[BZ #10871]
	* localedata/locales/hr_HR (mon): Rename to...
	(alt_mon): This.
	(mon): Import from CLDR (genitive case).
	(d_t_fmt): Update the comment.
2018-01-30 12:48:17 +01:00
Rafal Luzynski
8b406f8776 lt_LT: Add alternative month names (bug 10871).
[BZ #10871]
	* localedata/locales/lt_LT (alt_mon): Import from CLDR (nominative
	case).
2018-01-29 13:14:45 +01:00
Rafal Luzynski
105e90bd83 be_BY, be_BY@latin: Add alternative month names (bug 10871).
This patch also fixes spelling of lang_name in be_BY@latin, as reported
by Ihar Hrachyshka.

	[BZ #10871]
	* localedata/locales/be_BY (mon): Rename to...
	(alt_mon): This, then synchronize with CLDR (nominative case).
	(abmon): Rename to...
	(ab_alt_mon): This, then synchronize with CLDR (nominative case).
	(mon): Import from CLDR (genitive case).
	(abmon): Likewise.
	* localedata/locales/be_BY@latin (mon): Rename to...
	(alt_mon): This.
	(mon): Add, proper genitive forms provided by Viktar Siarheichyk.

	* localedata/locales/be_BY@latin (lang_name): Reworded to
	"biełaruskaja mova".
2018-01-29 13:14:45 +01:00
Rafal Luzynski
561cb41473 el_CY, el_GR: Add alternative month names (bug 10871).
[BZ #10871]
	* localedata/locales/el_CY (mon): Renamed to...
	(alt_mon): This.
	(mon): Import from CLDR (genitive case).
	* localedata/locales/el_GR: Likewise.
2018-01-29 13:14:45 +01:00
Rafal Luzynski
f7bdf30d15 ru_RU, ru_UA: Add alternative month names (bug 10871).
[BZ #10871]
	* localedata/locales/ru_RU (mon): Rename to...
	(alt_mon): This.
	(abmon): Rename to...
	(ab_alt_mon): This.
	(mon): Import from CLDR (genitive case).
	(abmon): Copy from the old content except the 5th month which is
	now in the genitive case, even when abbreviated.
	* localedata/locales/ru_UA: Likewise.
	* time/tst-strptime.c (day_tests): Add an actual example of
	a difference between %b and %Ob in Russian.
2018-01-29 13:14:45 +01:00
Rafal Luzynski
86530b9fed uk_UA: Add alternative month names (bug 10871).
Primary month names are in the genitive case now, alternative month names
are in the nominative case.

The alternative digits hack is no longer needed and has been removed.

	[BZ #10871]
	* localedata/locales/uk_UA (mon): Renamed to...
	(alt_mon): This.
	(alt_digits): "0" removed and then renamed to...
	(mon): This.
	(date_fmt): Definition changed not to use the alternative
	digits hack.
2018-01-25 14:01:43 +01:00
Rafal Luzynski
2aa8009d21 pl_PL: Add alternative month names (bug 10871).
[BZ #10871]
	* localedata/locales/pl_PL: Alternative month names added,
	primary month names are genitive now.
	* time/tst-strptime.c (day_tests): Actually use a genitive case
	of a month name in Polish language.
2018-01-22 11:27:09 +01:00
Rafal Luzynski
32ac6e927d locales gu_IN, lo_LA: Fix obvious typos in dates.
Reported-by: Robert Pluim <rpluim@gmail.com>

	* localedata/locales/gu_IN (LC_IDENTIFICATION): Fix an obvious typo
	in date: "2004-14-09" should be "2004-09-14".
	* localedata/locales/lo_LA: Fix an obvious typo in date in the header:
	"2003-15-09" should be "2003-09-15".
2018-01-19 01:09:12 +01:00
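Swapped day and month fields like these can be caught mechanically. The check below is a minimal sketch, assuming the dates are meant to be in YYYY-MM-DD form; the helper name is hypothetical and is not part of the glibc test suite.

```python
from datetime import datetime


def is_valid_iso_date(text):
    """Return True if text parses as a real YYYY-MM-DD calendar date."""
    try:
        datetime.strptime(text, "%Y-%m-%d")
    except ValueError:
        return False
    return True


print(is_valid_iso_date("2004-14-09"))  # False: there is no month 14
print(is_valid_iso_date("2004-09-14"))  # True
print(is_valid_iso_date("2003-15-09"))  # False
```

Such a check only catches swaps where the day is greater than 12; a date like "2004-09-05" written as "2004-05-09" would still pass.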
Rafal Luzynski
e234d7cb9a locales bho_NP, mai_IN, mai_NP: Fix an obvious typo in date.
* localedata/locales/bho_NP (LC_IDENTIFICATION): Fix an obvious typo
	in date: "2017-24-07" should be "2017-07-24".
	* localedata/locales/mai_IN: Likewise.
	* localedata/locales/mai_NP: Likewise.
2018-01-18 01:27:10 +01:00
Egmont Koblinger
f172187b2d hu_HU locale: Avoid double space (bug 22657).
The current date format prefixes one-digit days with a space, resulting
in an ugly double space:

$ LC_ALL=hu_HU.UTF-8 date
2018. jan.  1., hétfő, 21:25:35 CET
          ^^

The official orthography rules don't contain an explicit rule about
this (which already means there is no sane reason for a double space),
and an implicit example of "1848. március 9." under bullet point 296 at
http://helyesiras.mta.hu/helyesiras/default/akh12 contains a single
space only. An HTML page is admittedly not convincing evidence, but I
can confirm that the official book edition (e.g.
https://www.libri.hu/en/konyv/a-magyar-helyesiras-szabalyai-32.html)
also contains a single space there.

	[BZ #22657]
	* localedata/locales/hu_HU (d_t_fmt): Avoid a leading space
	before the day number which may produce a double space.
	(date_fmt): Likewise.
2018-01-12 01:57:31 +01:00
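The double space comes from padding a one-digit day to two columns. The sketch below imitates that with plain string formatting; it does not use the locale's actual d_t_fmt, and the helper names and the shortened output string are assumptions for illustration.

```python
def day_field(day, padded):
    """Space-pad the day to width 2 (like a padded day field) or leave it bare."""
    return f"{day:2d}" if padded else str(day)


def hu_date(day, padded):
    # Shortened imitation of the output shown in the commit message.
    return f"2018. jan. {day_field(day, padded)}., hétfő"


print(repr(hu_date(1, padded=True)))   # '2018. jan.  1., hétfő'
print(repr(hu_date(1, padded=False)))  # '2018. jan. 1., hétfő'
print(repr(hu_date(15, padded=True)))  # '2018. jan. 15., hétfő'
```

Two-digit days are unaffected either way, so the difference only shows on the 1st through the 9th.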
Mike FABIAN
d89756ebe1 lt_LT locale: Base collation on copy "iso14651_t1" [BZ #22524]
[BZ #22524]
	* localedata/Makefile: Add lt_LT.UTF-8 to test-input
	and to the list of locales to be built for testing.
	* localedata/lt_LT.UTF-8.in: New file for testing the collation.
	* localedata/locales/lt_LT (LC_COLLATE): Use “copy "iso14651_t1"”
	and build the collation rules upon that.
2017-12-07 15:38:11 +01:00
Mike FABIAN
62ea2193ee hsb_DE locale: Base collation on copy "iso14651_t1" [BZ #22515]
[BZ #22515]
	* localedata/Makefile: Add hsb_DE.UTF-8 to test-input
	and to the list of locales to be built for testing.
	* localedata/hsb_DE.UTF-8.in: New file for testing the collation.
	* localedata/locales/hsb_DE (LC_COLLATE): Use “copy "iso14651_t1"”
	and build the collation rules upon that.
2017-12-06 12:04:29 +01:00
Mike FABIAN
de9661d6be et_EE locale: Base collation on iso14651_t1 [BZ #22517]
[BZ #22517]
	* localedata/Makefile: Add et_EE.UTF-8 to test-input
	and to the list of locales to be built for testing.
	* localedata/et_EE.UTF-8.in: New file for testing the collation.
	* localedata/locales/et_EE (LC_COLLATE): Use “copy "iso14651_t1"”
	and build the collation rules upon that.
2017-12-05 17:10:08 +01:00
Mike FABIAN
96b06a19e6 tr_TR locale: Base collation on iso14651_t1 [BZ #22527]
[BZ #22527]
	* localedata/locales/tr_TR (LC_COLLATE): Base collation rules
	on iso14651_t1. A test file localedata/tr_TR.UTF-8.in is already
	available, this rewrite of the collation rules does reproduce
	the test file in the same order.
2017-12-04 18:36:01 +01:00
Mike FABIAN
1f6d91f328 hr_HR locale: Don’t use single code points for the digraphs in LC_TIME
[BZ #10580]
	* localedata/locales/hr_HR (LC_TIME): Use two letters for the
	digraphs in the month and day names. Using single code points for
	digraphs is deprecated.  While there are dedicated Unicode
	codepoints for the digraphs, these are included for backwards
	compatibility and modern texts use a sequence of Basic Latin
	characters. See: https://www.unicode.org/faq/ligature_digraph.html
	This makes the month and day names agree exactly with CLDR now,
	CLDR does not use the single code points for the digraphs either.
2017-12-04 18:36:01 +01:00
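The relationship between a single-code-point digraph and its two-letter spelling can be checked with Unicode compatibility normalization. This only illustrates the equivalence; it is not how the locale file was edited, and the sample word (the Croatian month name "srpanj") is chosen here as an example.

```python
import unicodedata

# "srpanj" spelled with U+01CC LATIN SMALL LETTER NJ as its last character.
single_code_point = "srpa\u01cc"

# NFKC replaces the digraph character with the two Basic Latin letters n + j.
two_letters = unicodedata.normalize("NFKC", single_code_point)

print(two_letters == "srpanj")                   # True
print(len(single_code_point), len(two_letters))  # 5 6
```

Plain string comparison treats the two spellings as different, which is one practical reason to prefer the two-letter form that CLDR uses.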
Mike FABIAN
d985adae22 is_IS locale: Base collation on iso14651_t1 [BZ #22519]
2017-12-01 14:08:58 +01:00
Mike FABIAN
fbb5fd03d3 sr_RS and bs_BA locales: make collation rules the same as for hr_HR [BZ #22534]
According to CLDR, collation rules for Serbian and Bosnian
should be the same as for Croatian.

	[BZ #22534]
	* localedata/Makefile: Add sr_RS.UTF-8 and bs_BA.UTF-8 to test-input
	and to the list of locales to be built for testing.
	* localedata/bs_BA.UTF-8.in: New file (same as hr_HR.UTF-8.in).
	* localedata/sr_RS.UTF-8.in: New file (same as hr_HR.UTF-8.in).
	* localedata/locales/bs_BA (LC_COLLATE): Use “copy "hr_HR"”.
	* localedata/locales/sr_RS (LC_COLLATE): Use “copy "hr_HR"”.
2017-11-30 16:03:22 +01:00