* localedata/locales/iso14651_t1_common: Add some convenient collation
symbols like <AFTER-A>, <BEFORE-A> to make tailoring easier using
rules similar to those in CLDR.
* localedata/locales/iso14651_t1_common: The new version of this
file downloaded from ISO contained several syntax errors which
are fixed by this patch.
[BZ #14095] - Review / update collation data from Unicode / ISO 14651
File downloaded from:
http://standards.iso.org/iso-iec/14651/ed-4/ISO14651_2016_TABLE1_en.txt
Updating this file alone is not enough, there are problems in the new
file which need to be fixed and the collation rules for many locales
need to be adapted. This is done by the following patches.
This update also fixes the problem that many characters are treated as
identical when sorting because they were not yet in the old
iso14651_t1_common file, see:
https://bugzilla.redhat.com/show_bug.cgi?id=1336308
- Infinite (∞) and empty set (∅) are treated as if they were the same character by sort and uniq
[BZ #14095]
* localedata/locales/iso14651_t1_common: Update file to
latest version from ISO (ISO14651_2016_TABLE1_en.txt).
LC_TIME in these 4 locales is identical, using “copy "es_BO"” makes
that more obvious.
[BZ #22646]
* localedata/locales/es_CL (LC_TIME): copy "es_BO".
* localedata/locales/es_CU (LC_TIME): copy "es_BO".
* localedata/locales/es_EC (LC_TIME): copy "es_BO".
[BZ #10871]
* localedata/locales/ru_RU (mon): Rename to...
(alt_mon): This.
(abmon): Rename to...
(ab_alt_mon): This.
(mon): Import from CLDR (genitive case).
(abmon): Copy from the old content except the 5th month which is
now in the genitive case, even when abbreviated.
* localedata/locales/ru_UA: Likewise.
* time/tst-strptime.c (day_tests): Add an actual example of
a difference between %b and %Ob in Russian.
Primary month names are in a genitive case now, alternative month names
are in a nominative case.
The alternative digits hack is no longer needed and has been removed.
[BZ #10871]
* localedata/locales/uk_UA (mon): Renamed to...
(alt_mon): This.
(alt_digits): "0" removed and then renamed to...
(mon): This.
(date_fmt): Definition changed not to use the alternative
digits hack.
[BZ #10871]
* localedata/locales/pl_PL: Alternative month names added,
primary month names are genitive now.
* time/tst-strptime.c (day_tests): Actually use a genitive case
of a month name in Polish language.
Reported-by: Robert Pluim <rpluim@gmail.com>
* localedata/locales/gu_IN (LC_IDENTIFICATION): Fix an obvious typo
in date: "2004-14-09" should be "2004-09-14".
* localedata/locales/lo_LA: Fix an obvious typo in date in the header:
"2003-15-09" should be "2003-09-15".
* localedata/locales/bho_NP (LC_IDENTIFICATION): Fix an obvious typo
in date: "2017-24-07" should be "2017-07-24".
* localedata/locales/mai_IN: Likewise.
* localedata/locales/mai_NP: Likewise.
The current date format prefixes one-digit days with a space, resulting
in ugly two spaces:
$ LC_ALL=hu_HU.UTF-8 date
2018. jan. 1., hétfő, 21:25:35 CET
^^
The official orthography rules doesn't contain an explicit rule about
this (which already gives no sane reason for double space), and an
implicit example of "1848. március 9." under bullet point 296 at
http://helyesiras.mta.hu/helyesiras/default/akh12 contains a single
space only. It's sure not convincing on an HTML page, but I confirm
that the official book edition (e.g.
https://www.libri.hu/en/konyv/a-magyar-helyesiras-szabalyai-32.html)
also contains a single space there.
[BZ #22657]
* localedata/locales/hu_HU (d_t_fmt): Avoid a leading space
before the day number which may produce a double space.
(date_fmt): Likewise.
[BZ #22524]
* localedata/Makefile: Add lt_LT.UTF-8 to test-input
and to the list of locales to be built for testing.
* localedata/lt_LT.UTF-8.in: New file for testing the collation.
* localedata/locales/lt_LT (LC_COLLATE): Use “copy "iso14651_t1"”
and build the collation rules upon that.
[BZ #22515]
* localedata/Makefile: Add hsb_DE.UTF-8 to test-input
and to the list of locales to be built for testing.
* localedata/hsb_DE.UTF-8.in: New file for testing the collation.
* localedata/locales/hsb_DE (LC_COLLATE): Use “copy "iso14651_t1"”
and build the collation rules upon that.
[BZ #22517]
* localedata/Makefile: Add et_EE.UTF-8 to test-input
and to the list of locales to be built for testing.
* localedata/et_EE.UTF-8.in: New file for testing the collation.
* localedata/locales/et_EE (LC_COLLATE): Use “copy "iso14651_t1"”
and build the collation rules upon that.
[BZ #22527]
* localedata/locales/tr_TR (LC_COLLATE): Base collation rules
on iso14651_t1. A test file localedata/tr_TR.UTF-8.in is already
available, this rewrite of the collation rules does reproduce
the test file in the same order.
[BZ #10580]
* localedata/locales/hr_HR (LC_TIME): Use two letters for the
digraphs in the month and day names. Using single code points for
digraphs is deprecated. While there are dedicated Unicode
codepoints, for the digraphs, these are included for backwards
compatibility and modern texts use a sequence of Basic Latin
characters. See: https://www.unicode.org/faq/ligature_digraph.html
This makes the month and day names agree exactly with CLDR now,
CLDR does not use the single code points for the digraphs either.
According to CLDR, collation rules for Serbian and Bosnian
should be the same as for Croatian.
[BZ #22534]
* localedata/Makefile: Add sr_RS.UTF-8 and bs_BA.UTF-8 to test-input
and to the list of locales to be built for testing.
* localedata/bs_BA.UTF-8.in: New file (same as hr_HR.UTF-8.in).
* localedata/sr_RS.UTF-8.in: New file (same as hr_HR.UTF-8.in).
* localedata/locales/bs_BA (LC_COLLATE): Use “copy "hr_HR"”.
* localedata/locales/sr_RS (LC_COLLATE): Use “copy "hr_HR"”.
[BZ #10580]
* localedata/locales/hr_HR (LC_COLLATE): Base collation rules on
iso14651_t1.
* localedata/locales/hr_HR (LC_TIME): Sync month and day names with
CLDR (except use ligatures for the digraphs, CLDR does not use
the ligatures), add first_workday, some fixes in the date and time
formats.
* localedata/locales/hr_HR (LC_CTYPE): Add transliteration rules
for Đ and đ.
* localedata/locales/hr_HR (LC_MONETARY): Change currency_symbol to
lower case. p_cs_precedes and n_cs_precedes should be 0 instead of 1.
Add int_p_cs_precedes and int_n_cs_precedes.
* localedata/locales/hr_HR (LC_NUMERIC): Change thousands_sep to
"<U202F>" (NARROW NO-BREAK SPACE) and grouping to 3;3 (Agrees with
LC_MONETARY now).
* localedata/locales/hr_HR (LC_TELEPHONE): Add tel_dom_fmt.
* localedata/locales/hr_HR (LC_NAME): Add name_mr, name_mrs, and
name_miss.
* localedata/locales/hr_HR (LC_ADDRESS): Add country_post, country_isbn,
and lang_lib. Change postal_fmt.
change
[BZ #17750]
* Makefile: add fr_CA.UTF-8 to test-input and LOCALES.
* localedata/fr_CA.UTF-8.in: New file with test data for backward
accents sorting.
* localedata/fr_FR.UTF-8.in: Fix test data for forward accents
sorting.
* localedata/locales/cs_CZ (LC_COLLATE): Remove “define DIACRIT_FORWARD”
* localedata/locales/de_DE (LC_COLLATE): Likewise.
* localedata/locales/hu_HU (LC_COLLATE): Likewise.
* localedata/locales/lb_LU (LC_COLLATE): Likewise.
* localedata/locales/yuw_PG (LC_COLLATE): Likewise.
* localedata/locales/fr_CA (LC_COLLATE): Add “define DIACRIT_BACKWARD”
* localedata/locales/iso14651_t1_common: Use “ifdef DIACRIT_FORWARD”
instead of “ifdef DIACRIT_BACKWARD”.
The only locale which currently needs backward accents sorting is fr_CA.
Therefore, forward accents sorting should be the default.
Before this patch, backwards accent sorting was the default and all
locales except fr_CA had to use
define DIACRIT_FORWARD
before
copy "iso14651_t1"
Most locales didn’t do that and thus got the inappropriate backwards accents sorting
by accident. Now only the fr_CA locale needs to use
define DIACRIT_BACKWARD
before
copy "iso14651_t1"
Original patch slightly modified by: Mike FABIAN <mfabian@redhat.com>
[BZ #22336]
* localedata/locales/cs_CZ (LC_COLLATE): Use “copy "iso14651_t1"”
and implement the collation rules for cs from CLDR on top of that.
* Makefile: Add cs_CZ.UTF-8 to test-input and to the list
of locales to be built for testing.
* cs_CZ.UTF-8.in: New file with test data to test the Czech sorting.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
[BZ #22469]
* localedata/locales/pl_PL (LC_COLLATE): Use “copy "iso14651_t1"”
and implement the collation rules for pl from CLDR on top of that.
* Makefile: Add pl_PL.UTF-8 to test-input and to the list
of locales to be built for testing.
* pl_PL.UTF-8.in: New file with test data to test the Polish sorting.
[BZ #15537]
* localedata/locales/lv_LV (LC_COLLATE): Fix collation by
using “copy "iso14651_t1"” and then implementing the
collation rules for lv from CLDR on top of that.
* Makefile: Add lv_LV.UTF-8 to test-input and to the list
of locales to be built for testing.
* lv_LV.UTF-8.in: New file with test data to test the Latvian
sorting.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Following the previous work by Carlos O'Donell the category of LC_CTYPE
is correctly set to "i18n:2012" rather than "unicode:2014" and the
i18n_ctype file is once again regenerated from scratch to make sure it
does not contain any manual additions except the copyright message.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* localedata/unicode-gen/gen_unicode_ctype.py (output_head):
category of LC_CTYPE set to "i18n:2012".
* localedata/locales/i18n_ctype: Regenerate.
[BZ #19485]
* localedata/locales/csb_PL (LC_TIME): Fix “abmon” for March
and use a better translation for March in “mon”.
* localedata/locales/csb_PL: Use more ASCII to improve the
readability of the source.
[BZ #13953]
* localedata/locales/km_KH: Use ASCII as much
as possible for better readability of the source and
remove useless comments.
* localedata/locales/km_KH (LC_TIME): Remove era stuff, it
was commented out and apparently wrong anyway because it was
using Lao characters. If Buddhist era should be used
for km_KH, a native speaker should write the correct formaat
for Khmer.
* localedata/locales/km_KH (LC_TIME): Add first_weekday 1
(According to CLDR, the first weekday for Cambodia is Sunday).
* localedata/locales/km_KH (LC_NAME): Remove name_mr and name_mrs
(These were using Lao characters which must be wrong. If we get
the correct data from a native speaker, we could add it back, until
then it is better not to have name_mr and name_mrs at all than
having it wrong).
[BZ #15260]
* localedata/locales/doi_IN (LC_MESSAGES): Match only for the
first letters of yesstr and nostr in yesexpr and noexpr,
not for the full words.
* localedata/locales/hne_IN (LC_MESSAGES): Likewise.
* localedata/locales/kok_IN (LC_MESSAGES): Likewise.
* localedata/locales/mr_IN (LC_MESSAGES): Likewise.
* localedata/locales/sat_IN (LC_MESSAGES): Likewise.
* localedata/locales/km_KH (LC_MESSAGES): Match also for the
first letters of yesstr and nostr in yesexpr and noexpr,
until now only English was matched in yesexpr and noexpr.
* localedata/locales/tl_PH (LC_MESSAGES): Use “copy "fil_PH"”
instead of “copy "en_US"”. CLDR has yesstr and nostr data for
fil but not for tl. As tl and fil are very similar, using fil
is probably better than using English.
Pablo was l10n/i18n coordinator back in the old days but MandrakeSoft is
dead now
* localedata/locales/br_FR (LC_IDENTIFICATON): Add
Thierry Vignaud <thierry.vignaud@gmail.com> as the contact
for the br_FR locale.
"Ket" is the the most used negative answer, as it's the negative answer
to a positively phrased question
It's used as it or with the verb ("Ne ran ket", ...)
As such, "Ket" is used in most translations.
"Nann" is less used as it's the negative answer to a negatively phrased
question
See https://en.wikipedia.org/wiki/Yes_and_no for explanations about
languages with 3 or 4 form systems.
We still keep "Nn" for short answers as:
- new learners are used to "Non" in french
- and they often misuses "Nann"
- for compatibility with english
[BZ #21706]
* localedata/locales/br_FR (LC_MESSAGES): Fix nostr.
After the transition to generating a distinct file for Unicode ctype
information e.g. i18n_ctype, the check target was left with the wrong
target name. This patch fixes the check target and regenerates the
files with more information than previously used, filling in the the
LC_IDENTIFICATION data.
Tested on x86_64 by regenerating from Unicode source files, and
running checks. Tested by subsequently rebuilding all locales.
No regressions in testsuite.
Signed-off-by: Carlos O'Donell <carlos@redhat.com>
Reported-by: Rafal Luzynski <digitalfreak@lingonborough.com>
* localedata/locales/hi_IN (LC_MESSAGES): In yesexpr and noexpr,
also check for the first characters of yesstr and nostr.
* localedata/locales/kn_IN (LC_MESSAGES): Likewise.
* localedata/locales/ks_IN@devanagari (LC_MESSAGES): Likewise.
* localedata/locales/chr_US (LC_MESSAGES): In yesexpr and noexpr,
match also for the contents of yesstr and nostr. As the first letter
of yesstr and nostr is equal, checking only for the first letter
is not enough.
* localedata/locales/ug_CN (LC_MESSAGES): Fix noexpr and yesexpr
by including the first letters of nostr and yesexpr in the regexp.
Also make it more readable by using ASCII where possible.
* localedata/locales/te_IN (LC_MESSAGES): Fix noexpr by including
the first letter of nostr in the regexp. It agrees with CLDR now.
Also make it more readable by using ASCII where possible.
* localedata/locales/km_KH (LC_MESSAGES): Fix yestr and nostr.
The yesstr and nostr apparently came from CLDR. And CLDR has a bug there:
these strings contain a U+17D6 (which somewhat looks like a colon)
instead of a real colon to separate the full words for “yes”
and “no” from the single letter responses.
* localedata/locales/ka_GE (LC_MESSAGES): Fix yesexp to make
it agree with CLDR (include the first letter of yesstr).
Also make it more readable by using ASCII where possible.
* localedata/locales/mr_IN (LC_MESSAGES): Fix yesstr and nostr
and improve yesexpr and noexpr. The yesstr and nostr apparently
came from CLDR. And CLDR has a bug there: these strings contain
a U+0903 (which looks like a colon) instead of a real colon
to separate the full words for “yes” and “no” from the single
letter responses.
Using all characters of the full words for yes and no in yesexpr and noexpr
makes no sense here, especially not because the words for yes and no
share one character.
* localedata/locales/bn_BD (LC_MESSAGES): Use only the first
letters of the full yesstr and nostr in yesexpr and noexpr.
* localedata/locales/an_ES (LC_MESSAGES): Add yesstr and nostr.
* localedata/locales/an_ES (LC_ADDRESS): Add lang_term and lang_lib.
* localedata/locales/an_ES: Make source more readable by using ASCII
where possible.
* localedata/locales/tpi_PG (LC_MESSAGES): Fix yesexpr and noexpr
by adding the generic +1 and -0 as in all other locales.
* localedata/locales/tpi_PG (LC_TIME): Fix some typos in the month and
day names and make it more readable by using ASCII where possible.
[BZ #16777]
* localedata/locales/pl_PL (LC_MONETARY): Use U+202F as mon_thousands_sep
and improve readability by using more ASCII.
* localedata/locales/pl_PL (LC_NUMERIC): Use U+202F as thousands_sep
and improve readability by using more ASCII.
The Valencian (meridional Catalan) locale is basically a copy of the
Catalan locale. The point of having a separate locale is only for PO
translations. This locale is already provided by several distributions
and is already supported by various projects like LibreOffice, Mozilla,
Gnome, KDE.
Aurelien Jarno <aurelien@aurel32.net>
[BZ #2522]
* localedata/locales/ca_ES@valencia: New file.
* localedata/SUPPORTED: Add ca_ES@valencia/UTF-8.
CLDR uses this pattern as well.
[BZ #22019]
* localedata/locales/el_GR: Set n_cs_precedes to 0.
* localedata/locales/el_CY: copy "el_GR" because it is identical.
* stdlib/tst-strfmon_l.c: adapt test case.
The commit does the following things:
* Move non-transliteration Unicode generated data to i18n_ctype.
* Copy the i18n_ctype data into i18n and add transliteration.
In the future, any locale which needs Unicode LC_CTYPE data can
also just use `copy i18n_ctype` and get the base character classes
and maps without transliteration.
Tested by compiling all the locales and my prototype C.UTF-8 which
uses it.
Signed-off-by: Carlos O'Donell <carlos@redhat.com>
“Bengali” still remained in some comments in the bn_BD locale file,
in iso-639.def and in a test input file. Change it there as well.
“Bangla” is now used as the English name for this language in CLDR.
[BZ #14925]
* libio/tst-widetext.input: Change “Bengali” to “Bangla”.
* locale/iso-639.def: Change “Bengali” to “Bangla”.
* localedata/locales/bn_BD: “Bengali” was still used in some
comments. Change it to “Bangla”.
[BZ #15332]
* locales/es_CU (LC_MONETARY): use “,” for mon_decimal_point
and “.” for mon_thousands_sep (to agree with CLDR)
* locales/es_CU (LC_NUMERIC): Likewise.
[BZ #22038]
* locales/so_DJ (LC_TIME): Fix abday, abmon and
make t_fmt in the comment agree with the value of t_fmt.
* locales/so_ET (LC_TIME): Fix abday (From Axa to Axd)
* locales/so_KE (LC_TIME): Fix abday (From Axa to Axd)
* locales/so_SO (LC_TIME): Fix abday (From Axa to Axd)
[BZ #13805]
* locales/ru_RU (LC_MONETARY): Use “,” for mon_decimal_point
(to agree with CLDR).
* locales/ru_RU (LC_NUMERIC): Write mon_decimal_point in ASCII
for readability.
* locales/os_RU (LC_MONETARY): Copy from ru_RU,
makes it agree with CLDR.
Add locale for “Morisyen” which is also called “Mauritian Creole”
and is spoken in Mauritius.
[BZ #21971]
* localedata/SUPPORTED: Add mfe_MU/UTF-8.
* localedata/locales/mfe_MU: New File.
[BZ #21971]
* locale/iso-639.def: add Morisyen.
[BZ #14925]
* locales/bn_BD (LC_IDENTIFICATION): Change language name in
“title” and “language” from Bengali to Bangla.
* locales/bn_IN (LC_IDENTIFICATION): Likewise.
The custom stuff which was in LC_CTYPE of the km_KH locale seems
to be a very incomplete subset of what one gets by using
“copy "i18n"”. I cannot find anything special there which is not
in “copy "i18n"”, only lots of stuff which is missing.
[BZ #20008]
* locales/km_KH (LC_CTYPE): Use “copy "i18n"”.
[BZ #20482]
* locales/de_AT (LC_TIME): Use 2 letter abbreviations in abday.
* locales/de_BE (LC_TIME): Use 2 letter abbreviations in abday.
* locales/de_CH (LC_TIME): Use 2 letter abbreviations in abday.
* locales/de_DE (LC_TIME): Use readable ASCII in abday.
* locales/de_IT (LC_TIME): Use readable ASCII in abday.
* locales/de_LU (LC_TIME): Use 2 letter abbreviations in abday.
See also [BZ #20756].
U+202F NARROW NO-BREAK SPACE: a narrow form of a no-break space,
typically the width of a thin space or a mid space.
U+2009 THIN SPACE.
Many languages use small gap as thousands separator.
Thousands separator should not be a plain space, but a narrow space.
And additionally, it is not allowed to wrap line in the middle of the
number.
Locale data were created in a deep age of 8-bit encodings, so most of
them use space (incorrect: it allows wrapping the line in the middle
of the number), or NBSP (better, but typographically incorrect: space
between groups is too wide).
Now UNICODE is widely supported, so we should leave legacy characters
in favor of correct UNICODE character.
UNICODE has a dedicated character for this purpose:
NNBSP
U+202F NARROW NO-BREAK SPACE: a narrow form of a no-break space,
typically the width of a thin space or a mid space
The NNBSP exists since Unicode 3.0.
Use of NNBSP will prevent line wrapping in the midle of number and
improve readability of numbers.
[BZ #20756]
* locales/aa_DJ (LC_MONETARY): Replace space by NNBSP as thousands separator.
* locales/az_AZ (LC_MONETARY): Likewise.
* locales/be_BY (LC_MONETARY): Likewise.
* locales/be_BY@latin (LC_MONETARY): Likewise.
* locales/bg_BG (LC_MONETARY): Likewise.
* locales/bs_BA (LC_MONETARY): Likewise.
* locales/ce_RU (LC_MONETARY): Likewise.
* locales/crh_UA (LC_MONETARY): Likewise.
* locales/cs_CZ (LC_MONETARY): Likewise.
* locales/cs_CZ (LC_NUMERIC): Likewise.
* locales/cv_RU (LC_MONETARY): Likewise.
* locales/de_AT (LC_MONETARY): Likewise.
* locales/eo (LC_MONETARY): Likewise.
* locales/es_CR (LC_MONETARY): Likewise.
* locales/es_CR (LC_NUMERIC): Likewise.
* locales/es_CU (LC_MONETARY): Likewise.
* locales/et_EE (LC_MONETARY): Likewise.
* locales/et_EE (LC_NUMERIC): Likewise.
* locales/fi_FI (LC_MONETARY): Likewise.
* locales/fi_FI (LC_NUMERIC): Likewise.
* locales/fr_CA (LC_MONETARY): Likewise.
* locales/fr_FR (LC_MONETARY): Likewise.
* locales/fr_FR (LC_NUMERIC): Likewise.
* locales/fr_LU (LC_MONETARY): Likewise.
* locales/fr_LU (LC_NUMERIC): Likewise.
* locales/hr_HR (LC_MONETARY): Likewise.
* locales/ht_HT (LC_NUMERIC): Likewise.
* locales/kk_KZ (LC_MONETARY): Likewise.
* locales/kk_KZ (LC_NUMERIC): Likewise.
* locales/ky_KG (LC_MONETARY): Likewise.
* locales/ky_KG (LC_NUMERIC): Likewise.
* locales/lv_LV (LC_MONETARY): Likewise.
* locales/lv_LV (LC_NUMERIC): Likewise.
* locales/mg_MG (LC_MONETARY): Likewise.
* locales/mhr_RU (LC_MONETARY): Likewise.
* locales/mk_MK (LC_MONETARY): Likewise.
* locales/mk_MK (LC_NUMERIC): Likewise.
* locales/mn_MN (LC_MONETARY): Likewise.
* locales/nb_NO (LC_MONETARY): Likewise.
* locales/nb_NO (LC_NUMERIC): Likewise.
* locales/nl_AW (LC_MONETARY): Likewise.
* locales/nl_NL (LC_MONETARY): Likewise.
* locales/nn_NO (LC_MONETARY): Likewise.
* locales/os_RU (LC_MONETARY): Likewise.
* locales/pap_AW (LC_MONETARY): Likewise.
* locales/pap_CW (LC_MONETARY): Likewise.
* locales/ru_RU (LC_MONETARY): Likewise.
* locales/ru_RU (LC_NUMERIC): Likewise.
* locales/ru_UA (LC_MONETARY): Likewise.
* locales/sk_SK (LC_MONETARY): Likewise.
* locales/sk_SK (LC_NUMERIC): Likewise.
* locales/sl_SI (LC_MONETARY): Likewise.
* locales/sl_SI (LC_NUMERIC): Likewise.
* locales/sq_MK (LC_MONETARY): Likewise.
* locales/sv_SE (LC_MONETARY): Likewise.
* locales/sv_SE (LC_NUMERIC): Likewise.
* locales/tg_TJ (LC_MONETARY): Likewise.
* locales/tt_RU (LC_MONETARY): Likewise.
* locales/tt_RU@iqtelif (LC_MONETARY): Likewise.
* locales/uk_UA (LC_MONETARY): Likewise.
* locales/uk_UA (LC_NUMERIC): Likewise.
* locales/unm_US (LC_MONETARY): Likewise.
* locales/unm_US (LC_NUMERIC): Likewise.
* locales/wo_SN (LC_MONETARY): Likewise.
[BZ #17563]
[BZ #16905]
* locales/cmn_TW (LC_COLLATE): Use cns11643_stroke file for sorting.
* locales/cmn_TW (LC_TIME): Improve time and date formats.
* locales/cmn_TW (LC_MESSAGES): Add yesstr and nostr.
* locales/cns11643_stroke: New file, stroke count collation for
traditional Chinese.
These comments are useless and only confusing. The encodings used to
create binary locales from source locales are listed in the
localedata/SUPPORTED file. The source files itself are ASCII or UTF-8
encoded where non-ASCII UTF-8 is currently only used in comments. If
all locale source files are UTF-8 anyway, there is no need to specify
that in a special comment.
New locale is added for the Seychelles which is a member of the African
Union. English is an offical language for the Seychelles.
[BZ #21854]
* locales/en_SC: New file.
* localedata/SUPPORTED : Add en_SC/UTF-8.
For the locales doi_IN, kok_IN, and sat_IN, the words for
“yes” and “no” were apparently in yesexpr and noexpr.
Copy them from there to add yesstr and nostr.
Also make yesexpr and noexpr more readable by using
the POSIX portable character set.
* locales/doi_IN (LC_MESSAGES): Add yesstr and nostr.
* locales/kok_IN (LC_MESSAGES): Add yesstr and nostr.
* locales/sat_IN (LC_MESSAGES): Add yesstr and nostr.
This reverts commit 8f75515080
Revert “Fix yesexpr in en_DK locale”.
* locales/en_DK (LC_MESSAGES): Restore original yesexpr, noexpr,
yesstr, nostr. Convert them to ASCII and add a comment why
we want to have them like this.
And make the expressions more readable by using the POSIX portable character set
instead of Unicode code points.
* locales/agr_PE (LC_MESSAGES): drop .* from yesexpr and noexpr
* locales/az_IR (LC_MESSAGES): Improve yesexpr and noexpr.
* locales/az_IR (LC_ADDRESS): Fix typo in comment and
use the individual iso-639-3 code for South Azerbaijani
"azb" in lang_term.
* locales/az_IR (LC_NAME): Improve readability of name_fmt in source.
After the recent import of month names from CLDRv31 (bug 21217,
commit c853f14) an import of abbreviated month names is also needed
to make sure they match the full forms.
In case of kok_IN CLDR does not provide the abbreviated month names
explicitly but uses full month names in such cases so abmon section
has been copied from mon.
* localedata/locales/as_IN (abmon): Update from CLDR.
* localedata/locales/bn_BD (abmon): Likewise.
* localedata/locales/bn_IN (abmon): Likewise.
* localedata/locales/gu_IN (abmon): Likewise.
* localedata/locales/hi_IN (abmon): Likewise.
* localedata/locales/kn_IN (abmon): Likewise.
* localedata/locales/ml_IN (abmon): Likewise.
* localedata/locales/mr_IN (abmon): Likewise.
* localedata/locales/ne_NP (abmon): Likewise.
* localedata/locales/or_IN (abmon): Likewise.
* localedata/locales/pa_IN (abmon): Likewise.
* localedata/locales/ta_IN (abmon): Likewise.
* localedata/locales/te_IN (abmon): Likewise.
* localedata/locales/kok_IN (abmon): Likewise but copied from mon.
Maithili which is an official language not only in India but in Nepal as well.
https://en.wikipedia.org/wiki/Maithili_language
Reference is taken form mai_IN.
[BZ #21835]
* localedata/locales/mai_NP: New file.
* localedata/SUPPORTED: Add mai_NP/UTF-8.
After the recent update of int_select the comment needed an update, too.
While at this, all comments in LC_TELEPHONE were moved above their
respective values because this looks better. Some minor typos fixed.
[BZ #21783]
* localedata/locales/lg_UG (LC_TELEPHONE): Move all comments
above the values, correct some of them.
During Hindi Locale review I found many fields are incorrect
[BZ #21729]
* locales/hi_IN (LC_NAME): Fix name_mr, name_mrs, name_miss, name_ms
Signed-off-by: Akhilesh Kumar <akhilesh.k@samsung.com>
During Locale verification I observed that
yesstr and nostr are missing for Xhosa language locale
for South Africa
[BZ #21724]
* locales/xh_ZA (LC_MESSAGES): add yesstr and nostr
Signed-off-by: Akhilesh Kumar <akhilesh.k@samsung.com>
During Locale verification I observed that
Incorrect Full Weekday names for ks_IN@devanagari
Reference is taken from
http://www.mkraina.com/PDF/3-Self-authored%20Works%20(English)/15.pdf
And kashmiri devanagari travel book and other sources
[BZ #21721]
* locales/ks_IN@devanagari: Full weekday name Fix.
Signed-off-by: Akhilesh Kumar <akhilesh.k@samsung.com>
After the recent import of month names from CLDRv31 (bug 21217,
commit c853f14) more imports are also needed, mostly abbreviated month
names.
This patch also updates May (full month name) in ps_AF which was
skipped in the previous patch.
Incidentally, this import fixes bug 17225 (ar_SY) and partially
bug 19066 (ar_SA).
CLDR currently has a bug in the full month name for October for ar_IQ, see
http://unicode.org/cldr/trac/ticket/10460
* localedata/locales/ar_DZ (abmon): Full import from CLDR, abmon
is no longer abbreviated.
* localedata/locales/ar_IQ (abmon): Likewise.
* localedata/locales/ar_MA (abmon): Likewise.
* localedata/locales/ar_TN (abmon): Likewise.
* localedata/locales/ps_AF (abmon): Likewise.
* localedata/locales/ug_CN (abmon): Likewise.
* localedata/locales/ar_SA (abmon): Likewise, partially
fixes bug 19066.
* localedata/locales/ks_IN (abmon): A copy of mon.
* localedata/locales/ur_IN (abmon): Oct reworded "اكتوبر" to
"اکتوبر" (same change as mon).
* localedata/locales/ur_PK (abmon): Same changes as mon applied.
* localedata/locales/ps_AF (mon): May reworded "می" to "مۍ".
[BZ #17225]
* localedata/locales/ar_SY (abmon): May reworded "نوار" to
"أيار", this closes bug 17225.
* localedata/locales/ar_JO (abmon): Likewise.
* localedata/locales/ar_LB (abmon): Likewise.
[BZ #21711]
During Locale verification I observed that
yesstr and nostr are missing for Pashto [LC_MESSAGES] Locale
For Afghanistan reference google translate and Pashto travel book.
[BZ #21706]
During Locale verification i observed that
yesstr and nostr are missing for Breton [LC_MESSAGES] locale
Signed-off-by: Akhilesh Kumar <akhilesh.k@samsung.com>
After the recent import of month names from CLDR (bug 21217) more
imports are also needed, mostly abbreviated month names.
* localedata/locales/br_FR (abmon): Reworded "Eve " to "Mezh".
* localedata/locales/fy_NL (abmon): Reworded "Maa" (March) to
"Mrt" and "Maa" (May) to "Mai".
* localedata/locales/lg_UG (abmon): Reworded "Jun" to "Juu".
* localedata/locales/ln_CD (abmon): "yan", "fbl", "msi",
and so on.
* localedata/locales/mn_MN (abmon): "1-р сар", "2-р сар",
"3-р сар", and so on.
* localedata/locales/vi_VN (abmon): Reworded "Th01" to "Thg 1",
"Th02" to "Thg 2" and so on.
* localedata/locales/yo_NG (abday): "Àìkú", "Ajé", "Ìsẹ́gun",
and so on, also comment updated to match the new content.
(day): "Ọjọ́ Àìkú", "Ọjọ́ Ajé", "Ọjọ́ Ìsẹ́gun", and so on.
(abmon): "Ṣẹ́rẹ́", "Èrèlè", "Ẹrẹ̀nà", and so on.
(mon): Comment updated to match the actual content.
(d_t_fmt): Changed "%A" to "%a" and "%B" to "%b".
* localedata/locales/zu_ZA (abmon): "Jan", "Feb", "Mas",
and so on, also comment updated to match the new content.
(mon): comment updated to match the actual content.
This updates a bunch of locales based on CLDR v29 data:
az_AZ: changing Azərbaycanca to azərbaycan dili
be_BY: changing беларуская мова to беларуская
bem_ZM: changing iciBemba to Ichibemba
bg_BG: changing български език to български
bo_CN: changing པོད་སྐད་ to བོད་སྐད་
bo_IN: changing པོད་སྐད་ to བོད་སྐད་
br_FR: changing Brezhoneg to brezhoneg
brx_IN: lang_name: setting to बड़ो
ce_RU: changing нохчийн мотт to нохчийн
cs_CZ: changing Čeština to čeština
dz_BT: changing (རྫོང་ཁ to རྫོང་ཁ
el_CY: changing ελληνικά to Ελληνικά
el_GR: changing ελληνικά to Ελληνικά
es_AR: changing Español to español
es_BO: changing Español to español
es_CL: changing Español to español
es_CO: changing Español to español
es_CR: changing Español to español
es_CU: changing Español to español
es_DO: changing Español to español
es_EC: changing Español to español
es_ES: changing Español to español
es_GT: changing Español to español
es_HN: changing Español to español
es_MX: changing Español to español
es_NI: changing Español to español
es_PA: changing Español to español
es_PE: changing Español to español
es_PR: changing Español to español
es_PY: changing Español to español
es_SV: changing Español to español
es_US: changing Español to español
es_UY: changing Español to español
es_VE: changing Español to español
et_EE: changing eesti keel to eesti
eu_ES: changing Euskara to euskara
fr_BE: changing Français to français
fr_CA: changing Français to français
fr_CH: changing Français to français
fr_FR: changing Français to français
fr_LU: changing Français to français
fur_IT: changing Furlan to furlan
fy_NL: changing Frysk to West-Frysk
gl_ES: changing Galego to galego
gv_GB: changing y Ghaelg to Gaelg
he_IL: lang_name: setting to עברית
hsb_DE: changing Hornjoserbšćina to hornjoserbšćina
hy_AM: changing Հայերեն to հայերեն
id_ID: changing Bahasa Indonesia to Bahasa Indonesia
it_CH: changing Italiano to italiano
it_IT: changing Italiano to italiano
kl_GL: changing Kalaallisut to kalaallisut
km_KH: changing ភាសាខ្មែរ to ខ្មែរ
ko_KR: changing 한국말 to 한국어
ks_IN: changing kạ̄šur to کٲشُر
kw_GB: changing Kernowek to kernewek
ky_KG: changing Кыргызча to кыргызча
lg_UG: changing Oluganda to Luganda
lt_LT: changing lietuvių kalba to lietuvių
lv_LV: changing latviešu valoda to latviešu
mk_MK: changing македонск/и јазик to македонски
mn_MN: changing Монгол хэл to монгол
nb_NO: changing Bokmål to norsk bokmål
nn_NO: changing Nynorsk to nynorsk
os_RU: lang_name: setting to ирон
ru_RU: lang_name: setting to русский
ru_UA: lang_name: setting to русский
se_NO: changing Davvisámegiella to davvisámegiella
sk_SK: lang_name: setting to slovenčina
ta_IN: lang_name: setting to தமிழ்
ta_LK: lang_name: setting to தமிழ்
tk_TM: changing Türkmençe to türkmençe
tr_CY: changing Turkish to Türkçe
tr_TR: changing Turkish to Türkçe
ur_IN: lang_name: setting to {اردو}
ur_PK: lang_name: setting to {اردو}
vi_VN: changing Việt ngữ to Tiếng Việt
yo_NG: changing Yorùbá to Èdè Yorùbá
zu_ZA: changing IsiZulu to isiZulu
Most of these are simple case changes, but they match the CLDR db.
A search for a few of the others suggests they're also correct.
[BZ #21217]
* localedata/locales/ln_CD (mon): Months imported from CLDR, all changed:
"Yanwáli" to "sánzá ya yambo", "Febwáli" to "sánzá ya míbalé" and so on.
* locales/mn_MN (mon): reworded "Хулгана сарын" to "Нэгдүгээр сар",
"Үхэр сарын" to "Хоёрдугаар сар", and so on.
* locales/vi_VN (mon): reworded "Tháng một" to "Tháng 1", "Tháng hai"
to "Tháng 2", "Tháng ba" to "Tháng 3", and so on.
* locales/yo_NG (mon): reworded "Jánúárì" to "Oṣù Ṣẹ́rẹ́", "Fẹ́búárì"
to "Oṣù Èrèlè", "Máàṣì" to "Oṣù Ẹrẹ̀nà", and so on.
* locales/zu_ZA (mon): reworded "uMasingana" to "Januwari",
"uNhlolanja" to "Februwari", "uNdasa" to "Mashi", and so on.
[BZ #21217]
* localedata/locales/be_BY (mon, abmon): Reworded "Травень" ("travyen'")
and abbreviated "Тра" ("tra") to "Май" ("may").
* localedata/locales/be_BY@latin (mon, abmon): Likewise, "Travień"
and abbreviated "Tra" reworded to "Maj".
* localedata/locales/br_FR (day, abmon, mon, d_t_fmt): Use the proper
Unicode apostrophe (Cʼhwevrer, mercʼher, Dʼar).
* localedata/locales/es_PE (mon, abmon): Reworded "septiembre" to
"setiembre" and abbreviated "sep" to "set".
* localedata/locales/es_UY: Likewise.
* localedata/locales/fil_PH (mon): Reworded "Septiyembre" to "Setyembre"
and "Nobiyembre" to "Nobyembre".
* localedata/locales/fur_IT (mon, abmon): Reworded "Decembar" to
"Dicembar" and abbreviated "Dec" to "Dic".
* localedata/locales/fy_NL (mon): Reworded "Janaris" to "Jannewaris".
* localedata/locales/ha_NG (mon, abmon): Reworded "Fabrairu" to
"Faburairu" and "Afrilu" to "Afirilu", also abbreviated "Afr" to "Afi".
* localedata/locales/ig_NG (mon, abmon): All months begin with
uppercase. Reworded: "febụrụwarị" to "Febrụwarị", "epreel" to "Eprel",
"ọgọstụ" to "Ọgọọst", "nọvemba" to "Novemba", "nọv" to "Nov".
* localedata/locales/kw_GB (mon): Months imported from CLDR, many
changes: all "Mys" to "mis", "Mys Whevrel" to "mis Hwevrer", "Mys Merth"
to "mis Meurth", "Mys Evan" to "mis Metheven", and so on.
(abmon): "Whe>" to "Hwe", "Mer" to "Meu", "Evn" to "Met".
* localedata/locales/lg_UG (mon): Reworded "Julaai" to "Julaayi".
* localedata/locales/ln_CD (mon): Months imported from CLDR, all changed:
"Yanwáli" to "sánzá ya yambo", "Febwáli" to "sánzá ya míbalé" and so on.
* localedata/locales/mg_MG (mon, abmon): All months now begin with
uppercase.
* localedata/locales/se_NO (mon): Months imported from CLDR, all
suffixes changed from "mánu" to "mánnu".
* localedata/locales/sr_RS@latin (mon): Reworded "juni" to "jun" and
"juli" to "jul".
* localedata/locales/uz_UZ (mon): Reworded "Sentyabr" to "Sentabr"
and "Oktyabr" to "Oktabr".
* localedata/locales/uz_UZ@cyrillic (mon): Most of the names reworded,
no longer end with a soft sign.
* localedata/locales/wae_CH (mon): reworded "Bráchet" to "Bráčet",
"Öigschte" to "Öigšte", "Herbschtmánet" to "Herbštmánet",
"Chrischtmánet" to "Chrištmánet".
* Unicode 10.0.0 Support: Character encoding, character type info, and
transliteration tables are all updated to Unicode 10.0.0, using
generator scripts contributed by Mike FABIAN (Red Hat).
[BZ #21207]
* locales/ce_RU (day): Updated (imported) from CLDR. Uppercase letters
left unchanged.
* locales/ce_RU (abday): Minor updates to match (day): Latin uppercase
"I" replaced with Cyrillic "Ӏ" ("Palochka", Unicode: U04C0). Trailing
spaces removed.
Fix the incorrect sorting order of a digraph and its geminated variant,
regression introduced by a faulty fix to bug 13547 in commit
b008d4c856.
Fix two inconsistencies in sorting unusual capitalization of digraphs
(bug #18587).
Enable DIACRIT_FORWARD to work around bug #17750.
Sort foreign accents after the Hungarian ones.
Add extensive unittests containing all the examples from The Rules of
Hungarian Orthography and many more, including explanatory comments.
* Unicode 9.0.0 Support: Character encoding, character type info, and
transliteration tables are all updated to Unicode 9.0.0, using
generator scripts contributed by Mike FABIAN (Red Hat).
Both regexes end with a "*." which means the previous match can be
omitted, and then the . allows them to match any input at all.
This means tools like coreutils' `rm -i` will always delete things
when prompted because the yesexpr regex matches all inputs (even
the negative ones).
The standard currently in effect (LST ISO 8601:1997) mandates the use
of hyphens (as opposed to full stops, currently) in date formats. It
also matches current CLDR data (v29), Wikipedia's & Wikia's settings,
and Microsoft's Lithuanian Style Guide.
According to "Requirements of information technology in Estonian
language and cultural environment" the monetary symbol should be
written after the amount number:
https://www.evs.ee/products/evs-8-2008
The date format in en_CA/LC_TIME specifies the date format as "%d/%m/%y".
However, it should be "%Y-%m-%d". This is the standard date format in
Canada as specified by the Canadian Standards Association in CSA Z234.5:1989,
which adopts the ISO 8601 standard.
Here's the web page from the National Research Council of Canada
citing ISO 8601 as the standard date/time format in Canada:
http://www.nrc-cnrc.gc.ca/eng/services/time/faq/#Q8
International Standard ISO 8601 specifies numeric representations
of date and time. The recommended full format is of the form
2001-12-31 23:59:28.73 UTC. The intent of this standard is to avoid
confusion in international communications which can arise with the
many different national notations. This format has the advantage
that it permits dates to be readily sorted in chronological order
by computer systems.
Windows 8+ and OS X also switched to this format.
Fix the postal_fmt and country_name entries to continue on the following
line without indentation.
localedata/Changelog:
* locales/de_LI (postal_fmt): Fix indentation.
(country_name): Likewise.