[BZ #17750]
* Makefile: add fr_CA.UTF-8 to test-input and LOCALES.
* localedata/fr_CA.UTF-8.in: New file with test data for backward
accents sorting.
* localedata/fr_FR.UTF-8.in: Fix test data for forward accents
sorting.
* localedata/locales/cs_CZ (LC_COLLATE): Remove “define DIACRIT_FORWARD”
* localedata/locales/de_DE (LC_COLLATE): Likewise.
* localedata/locales/hu_HU (LC_COLLATE): Likewise.
* localedata/locales/lb_LU (LC_COLLATE): Likewise.
* localedata/locales/yuw_PG (LC_COLLATE): Likewise.
* localedata/locales/fr_CA (LC_COLLATE): Add “define DIACRIT_BACKWARD”
* localedata/locales/iso14651_t1_common: Use “ifdef DIACRIT_FORWARD”
instead of “ifdef DIACRIT_BACKWARD”.
The only locale which currently needs backward accents sorting is fr_CA.
Therefore, forward accents sorting should be the default.
Before this patch, backwards accent sorting was the default and all
locales except fr_CA had to use
define DIACRIT_FORWARD
before
copy "iso14651_t1"
Most locales didn’t do that and thus got the inappropriate backwards accents sorting
by accident. Now only the fr_CA locale needs to use
define DIACRIT_BACKWARD
before
copy "iso14651_t1"
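The difference between the two orderings can be illustrated with a simplified Python sketch. This is not glibc's collation code: it models only two levels (base letters, then accents), and the function name `sort_key` and its `backward_accents` parameter are made up for this illustration.

```python
import unicodedata

def sort_key(word, backward_accents=False):
    """Two-level key: base letters first, then accent marks.

    With backward_accents=True the accent level is compared starting
    from the end of the word (the fr_CA behaviour described above).
    """
    base = []
    accents = []
    for ch in unicodedata.normalize("NFC", word):
        decomposed = unicodedata.normalize("NFD", ch)
        base.append(decomposed[0].lower())
        # Combining marks that follow the base letter act as the accent weight.
        accents.append(decomposed[1:])
    if backward_accents:
        accents.reverse()
    return (base, accents)

# côté, cote, coté, côte
words = ["c\u00f4t\u00e9", "cote", "cot\u00e9", "c\u00f4te"]

forward = sorted(words, key=sort_key)
backward = sorted(words, key=lambda w: sort_key(w, backward_accents=True))

print("forward: ", forward)   # cote, coté, côte, côté
print("backward:", backward)  # cote, côte, coté, côté
```

With forward sorting the first accent difference from the left decides; with backward sorting the last one does, which is why "coté" and "côte" swap places.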
Original patch slightly modified by: Mike FABIAN <mfabian@redhat.com>
The LOCALES variable in the localedata had two instances of cs_CZ
which generated the following warning:
../gen-locales.mk:11: target '/opt/build/localedata/cs_CZ.UTF-8/LC_CTYPE' given more than once in the same rule
Dropped the duplicate entry.
[BZ #22336]
* localedata/locales/cs_CZ (LC_COLLATE): Use “copy "iso14651_t1"”
and implement the collation rules for cs from CLDR on top of that.
* Makefile: Add cs_CZ.UTF-8 to test-input and to the list
of locales to be built for testing.
* cs_CZ.UTF-8.in: New file with test data to test the Czech sorting.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
[BZ #22469]
* localedata/locales/pl_PL (LC_COLLATE): Use “copy "iso14651_t1"”
and implement the collation rules for pl from CLDR on top of that.
* Makefile: Add pl_PL.UTF-8 to test-input and to the list
of locales to be built for testing.
* pl_PL.UTF-8.in: New file with test data to test the Polish sorting.
[BZ #15537]
* localedata/locales/lv_LV (LC_COLLATE): Fix collation by
using “copy "iso14651_t1"” and then implementing the
collation rules for lv from CLDR on top of that.
* Makefile: Add lv_LV.UTF-8 to test-input and to the list
of locales to be built for testing.
* lv_LV.UTF-8.in: New file with test data to test the Latvian
sorting.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Update all sourceware links to https. The website redirects
everything to https anyway so let the web server do a bit less work.
The only reference that remains unchanged is the one in the old
ChangeLog, since it didn't seem worth changing it.
* NEWS: Update sourceware link to https.
* configure.ac: Likewise.
* crypt/md5test-giant.c: Likewise.
* dlfcn/bug-atexit1.c: Likewise.
* dlfcn/bug-atexit2.c: Likewise.
* localedata/README: Likewise.
* malloc/tst-mallocfork.c: Likewise.
* manual/install.texi: Likewise.
* nptl/tst-pthread-getattr.c: Likewise.
* stdio-common/tst-fgets.c: Likewise.
* stdio-common/tst-fwrite.c: Likewise.
* sunrpc/Makefile: Likewise.
* sysdeps/arm/armv7/multiarch/memcpy_impl.S: Likewise.
* wcsmbs/tst-mbrtowc2.c: Likewise.
* configure: Regenerate.
* INSTALL: Regenerate.
Following the previous work by Carlos O'Donell the category of LC_CTYPE
is correctly set to "i18n:2012" rather than "unicode:2014" and the
i18n_ctype file is once again regenerated from scratch to make sure it
does not contain any manual additions except the copyright message.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* localedata/unicode-gen/gen_unicode_ctype.py (output_head):
category of LC_CTYPE set to "i18n:2012".
* localedata/locales/i18n_ctype: Regenerate.
[BZ #19485]
* localedata/locales/csb_PL (LC_TIME): Fix “abmon” for March
and use a better translation for March in “mon”.
* localedata/locales/csb_PL: Use more ASCII to improve the
readability of the source.
[BZ #13953]
* localedata/locales/km_KH: Use ASCII as much
as possible for better readability of the source and
remove useless comments.
* localedata/locales/km_KH (LC_TIME): Remove era stuff, it
was commented out and apparently wrong anyway because it was
using Lao characters. If the Buddhist era should be used
for km_KH, a native speaker should write the correct format
for Khmer.
* localedata/locales/km_KH (LC_TIME): Add first_weekday 1
(According to CLDR, the first weekday for Cambodia is Sunday).
* localedata/locales/km_KH (LC_NAME): Remove name_mr and name_mrs
(These were using Lao characters which must be wrong. If we get
the correct data from a native speaker, we could add it back, until
then it is better not to have name_mr and name_mrs at all than
having it wrong).
[BZ #15260]
* localedata/locales/doi_IN (LC_MESSAGES): Match only for the
first letters of yesstr and nostr in yesexpr and noexpr,
not for the full words.
* localedata/locales/hne_IN (LC_MESSAGES): Likewise.
* localedata/locales/kok_IN (LC_MESSAGES): Likewise.
* localedata/locales/mr_IN (LC_MESSAGES): Likewise.
* localedata/locales/sat_IN (LC_MESSAGES): Likewise.
* localedata/locales/km_KH (LC_MESSAGES): Match also for the
first letters of yesstr and nostr in yesexpr and noexpr,
until now only English was matched in yesexpr and noexpr.
* localedata/locales/tl_PH (LC_MESSAGES): Use “copy "fil_PH"”
instead of “copy "en_US"”. CLDR has yesstr and nostr data for
fil but not for tl. As tl and fil are very similar, using fil
is probably better than using English.
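The "match only the first letters" rule can be sketched as follows. Python's `re` module stands in for POSIX extended regular expressions here, the helper name `first_letter_expr` is hypothetical, and the sample yesstr/nostr values are illustrative Hindi words, not copied from any locale file.

```python
import re

# Sample values, assumed for illustration only.
YESSTR = "\u0939\u093e\u0901"          # "yes"
NOSTR = "\u0928\u0939\u0940\u0902"     # "no"

def first_letter_expr(localized_word, english_letters, generic):
    """Build a yesexpr/noexpr-style pattern that matches only a first character."""
    chars = generic + english_letters + localized_word[0]
    return "^[" + "".join(re.escape(c) for c in chars) + "]"

yesexpr = first_letter_expr(YESSTR, "yY", "+1")
noexpr = first_letter_expr(NOSTR, "nN", "-0")

def classify(answer):
    """Classify a user response the way a yes/no prompt might."""
    if re.match(yesexpr, answer):
        return "yes"
    if re.match(noexpr, answer):
        return "no"
    return "unknown"
```

Because only the first character is tested, it does not matter that the sample word for "no" also contains the first letter of the word for "yes".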
Pablo was the l10n/i18n coordinator back in the old days, but MandrakeSoft is
dead now.
* localedata/locales/br_FR (LC_IDENTIFICATION): Add
Thierry Vignaud <thierry.vignaud@gmail.com> as the contact
for the br_FR locale.
"Ket" is the most used negative answer, as it's the negative answer
to a positively phrased question.
It's used on its own or with the verb ("Ne ran ket", ...).
As such, "Ket" is used in most translations.
"Nann" is less used, as it's the negative answer to a negatively phrased
question.
See https://en.wikipedia.org/wiki/Yes_and_no for explanations about
languages with 3 or 4 form systems.
We still keep "Nn" for short answers:
- new learners are used to "Non" in French
- and they often misuse "Nann"
- for compatibility with English
[BZ #21706]
* localedata/locales/br_FR (LC_MESSAGES): Fix nostr.
From localedef --help:
Output control:
...
--no-warnings=<warnings> Comma-separated list of warnings to disable;
supported warnings are: ascii, intcurrsym
...
--warnings=<warnings> Comma-separated list of warnings to enable;
supported warnings are: ascii, intcurrsym
Locales using SHIFT_JIS and SHIFT_JISX0213 character maps are not ASCII
compatible. In order to build locales using these character maps, and
have localedef exit with a status of 0, we add new options to localedef
to disable or enable specific warnings. The options are --no-warnings
and --warnings, to disable and enable specific warnings respectively.
The options take a comma-separated list of warning names. The warning
names are taken directly from the generated warning. When a warning
that can be disabled is issued it will print something like this: foo is
not defined [--no-warnings=foo]
For the initial implementation we add two controllable warnings; first
'ascii' which is used by the localedata installation makefile target to
install SHIFT_JIS and SHIFT_JISX0213-using locales without error; second
'intcurrsym' which allows a program to use a non-standard international
currency symbol without triggering a warning. The 'intcurrsym' is
useful in the future if country codes are added that are not in our
current ISO 4217 list, and the user wants to avoid the warning. Having
at least two warnings to control gives an example for how the changes
can be extended to more warnings if required in the future.
These changes allow ja_JP.SHIFT_JIS and ja_JP.SHIFT_JISX0213 to be
compiled without warnings using --no-warnings=ascii. The
localedata/Makefile $(INSTALL-SUPPORTED-LOCALES) target is adjusted to
automatically add `--no-warnings=ascii` for such charmaps, and likewise
localedata/gen-locale.sh is adjusted with similar logic.
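The kind of logic described for the Makefile target and gen-locale.sh can be sketched in Python. This is only an illustration: the helper name `localedef_args` and the charmap set are assumptions, and the function just builds a command line without running it.

```python
# Charmaps named above as not ASCII compatible.
NON_ASCII_CHARMAPS = {"SHIFT_JIS", "SHIFT_JISX0213"}

def localedef_args(source, charmap, output_name):
    """Build a localedef command line, disabling the 'ascii' warning
    only for charmaps that are known not to be ASCII compatible."""
    args = ["localedef", "-i", source, "-f", charmap]
    if charmap in NON_ASCII_CHARMAPS:
        args.append("--no-warnings=ascii")
    args.append(output_name)
    return args

print(" ".join(localedef_args("ja_JP", "SHIFT_JIS", "ja_JP.SHIFT_JIS")))
print(" ".join(localedef_args("ja_JP", "UTF-8", "ja_JP.UTF-8")))
```

Keeping the flag conditional means the warning stays active for every other charmap, where a non-ASCII-compatible result would be a real problem.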
v2: Bring verbose, be_quiet, and all warning control booleans into
record-status.c, and compile this object file to be used by locale,
iconv, and localedef. Any users include record-status.h.
v3: Fix an instance of boolean coercion in set_warning().
Signed-off-by: Carlos O'Donell <carlos@redhat.com>
The localedata collation test data is encoded in a particular
character set. We rename the test data to match the full locale
name with encoding, and adjust the Makefile and sort-test.sh
script. This allows us to have a future C.UTF-8 test that is
disambiguated from the built-in C locale.
Signed-off-by: Carlos O'Donell <carlos@redhat.com>
After the transition to generating a distinct file for Unicode ctype
information e.g. i18n_ctype, the check target was left with the wrong
target name. This patch fixes the check target and regenerates the
files with more information than previously used, filling in the
LC_IDENTIFICATION data.
Tested on x86_64 by regenerating from Unicode source files, and
running checks. Tested by subsequently rebuilding all locales.
No regressions in testsuite.
Signed-off-by: Carlos O'Donell <carlos@redhat.com>
Reported-by: Rafal Luzynski <digitalfreak@lingonborough.com>
* localedata/locales/hi_IN (LC_MESSAGES): In yesexpr and noexpr,
also check for the first characters of yesstr and nostr.
* localedata/locales/kn_IN (LC_MESSAGES): Likewise.
* localedata/locales/ks_IN@devanagari (LC_MESSAGES): Likewise.
* localedata/locales/chr_US (LC_MESSAGES): In yesexpr and noexpr,
match also for the contents of yesstr and nostr. As the first letter
of yesstr and nostr is equal, checking only for the first letter
is not enough.
* localedata/locales/ug_CN (LC_MESSAGES): Fix noexpr and yesexpr
by including the first letters of nostr and yesstr in the regexp.
Also make it more readable by using ASCII where possible.
* localedata/locales/te_IN (LC_MESSAGES): Fix noexpr by including
the first letter of nostr in the regexp. It agrees with CLDR now.
Also make it more readable by using ASCII where possible.
* localedata/locales/km_KH (LC_MESSAGES): Fix yesstr and nostr.
The yesstr and nostr apparently came from CLDR. And CLDR has a bug there:
these strings contain a U+17D6 (which somewhat looks like a colon)
instead of a real colon to separate the full words for “yes”
and “no” from the single letter responses.
* localedata/locales/ka_GE (LC_MESSAGES): Fix yesexpr to make
it agree with CLDR (include the first letter of yesstr).
Also make it more readable by using ASCII where possible.
* localedata/locales/mr_IN (LC_MESSAGES): Fix yesstr and nostr
and improve yesexpr and noexpr. The yesstr and nostr apparently
came from CLDR. And CLDR has a bug there: these strings contain
a U+0903 (which looks like a colon) instead of a real colon
to separate the full words for “yes” and “no” from the single
letter responses.
Using all characters of the full words for yes and no in yesexpr and noexpr
makes no sense here, especially not because the words for yes and no
share one character.
* localedata/locales/bn_BD (LC_MESSAGES): Use only the first
letters of the full yesstr and nostr in yesexpr and noexpr.
* localedata/locales/an_ES (LC_MESSAGES): Add yesstr and nostr.
* localedata/locales/an_ES (LC_ADDRESS): Add lang_term and lang_lib.
* localedata/locales/an_ES: Make source more readable by using ASCII
where possible.
* localedata/locales/tpi_PG (LC_MESSAGES): Fix yesexpr and noexpr
by adding the generic +1 and -0 as in all other locales.
* localedata/locales/tpi_PG (LC_TIME): Fix some typos in the month and
day names and make it more readable by using ASCII where possible.
[BZ #16777]
* localedata/locales/pl_PL (LC_MONETARY): Use U+202F as mon_thousands_sep
and improve readability by using more ASCII.
* localedata/locales/pl_PL (LC_NUMERIC): Use U+202F as thousands_sep
and improve readability by using more ASCII.
The Valencian (meridional Catalan) locale is basically a copy of the
Catalan locale. The point of having a separate locale is only for PO
translations. This locale is already provided by several distributions
and is already supported by various projects like LibreOffice, Mozilla,
Gnome, KDE.
Aurelien Jarno <aurelien@aurel32.net>
[BZ #2522]
* localedata/locales/ca_ES@valencia: New file.
* localedata/SUPPORTED: Add ca_ES@valencia/UTF-8.
CLDR uses this pattern as well.
[BZ #22019]
* localedata/locales/el_GR: Set n_cs_precedes to 0.
* localedata/locales/el_CY: copy "el_GR" because it is identical.
* stdlib/tst-strfmon_l.c: adapt test case.
The error and warning handling in localedef, locale, and iconv
is a bit of a mess.
We use ugly constructs like this:
WITH_CUR_LOCALE (error (1, errno, gettext ("\
cannot read character map directory `%s'"), directory));
to issue errors, and read error_message_count directly from the
error API to detect errors. The problem with that is that the
code also uses error to print warnings, and informative messages.
All of this leads to problems where just having warnings will
produce an exit status as-if errors had been seen.
To fix this situation I have adopted the following high-level
changes:
* All errors are counted distinctly.
* All warnings are counted distinctly.
* All informative messages are not counted.
* Increasing verbosity cannot generate *more* errors, and
it previously did for errors conditional on verbose,
this is now fixed.
* Increasing verbosity *can* generate *more* warnings.
* Making the output quiet cannot generate *fewer* errors,
and it previously did for errors conditional on be_quiet,
this is now fixed.
* Each of error, warning, and informative message has its
own function to call defined in record-status.h, and they
are: record_error, record_warning, and record_verbose.
* The record_error function always records an error, but
conditional on be_quiet may not print it.
* The record_warning function always records a warning,
but conditional on be_quiet may not print it.
* The record_verbose function only prints the verbose
message if verbose is true and be_quiet is false.
This has allowed the following fix:
* Previously any warnings were being treated as errors
because they incremented error_message_count, but now
we properly return an exit status of 1 if there are
warnings but output was generated.
All of this allows localedef to correctly decide if errors,
or warnings were present, and produce the correct exit code.
The locale and iconv programs now also use record-status.h
and we have removed the WITH_CUR_LOCALE hack, and instead
have internal push_locale/pop_locale functions centralized
in the record routines.
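The counting rules listed above can be modelled with a small Python sketch. It is not the C code in record-status.c: the class name `RecordStatus` and the `printed` list are assumptions made for illustration, and only the behaviour stated above (what is counted, what is printed) is modelled.

```python
import sys

class RecordStatus:
    """Toy model of the record_error/record_warning/record_verbose rules."""

    def __init__(self, verbose=False, be_quiet=False):
        self.verbose = verbose
        self.be_quiet = be_quiet
        self.error_count = 0
        self.warning_count = 0
        self.printed = []  # kept so the behaviour is easy to inspect

    def _emit(self, message):
        self.printed.append(message)
        print(message, file=sys.stderr)

    def record_error(self, message):
        # Always counted; printing depends on be_quiet.
        self.error_count += 1
        if not self.be_quiet:
            self._emit(message)

    def record_warning(self, message):
        # Always counted; printing depends on be_quiet.
        self.warning_count += 1
        if not self.be_quiet:
            self._emit(message)

    def record_verbose(self, message):
        # Never counted; printed only if verbose and not quiet.
        if self.verbose and not self.be_quiet:
            self._emit(message)
```

Because the counters are independent of `verbose` and `be_quiet`, changing the verbosity can change what is printed but not whether errors are reported in the final status.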
Signed-off-by: Carlos O'Donell <carlos@redhat.com>
The commit does the following things:
* Move non-transliteration Unicode generated data to i18n_ctype.
* Copy the i18n_ctype data into i18n and add transliteration.
In the future, any locale which needs Unicode LC_CTYPE data can
also just use `copy i18n_ctype` and get the base character classes
and maps without transliteration.
Tested by compiling all the locales and my prototype C.UTF-8 which
uses it.
Signed-off-by: Carlos O'Donell <carlos@redhat.com>
This code page is identical to code page 850 except that X'D5'
has been changed from LI61 (dotless i) to SC20 (euro symbol).
The code points from /x01 to /x1f in the /localedata/charmaps/IBM858
file have the same mapping as those in localedata/charmaps/ANSI_X3.4-1968.
That means they disagree with
ftp://ftp.software.ibm.com/software/globalization/gcoc/attachments/CP00858.txt
in that range.
For example, localedata/charmaps/IBM858 and localedata/charmaps/ANSI_X3.4-1968 have:
“<U0001> /x01 START OF HEADING (SOH)”
whereas CP00858.txt has:
“01 SS000000 Smiling Face”
That means that CP00858.txt is not really ASCII-compatible and to make
it ASCII-compatible we deviate from CP00858.txt in the code points from /x01
to /x1f.
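The relationship between the two code pages can be checked with a short Python sketch. Note the assumption: this uses Python's own `cp850` and `cp858` codecs, not the glibc charmap or the iconv module added here, so it only illustrates the mapping described above.

```python
def differing_bytes(codec_a, codec_b):
    """Return the byte values that decode differently in the two codecs."""
    diffs = []
    for value in range(256):
        raw = bytes([value])
        a = raw.decode(codec_a, errors="replace")
        b = raw.decode(codec_b, errors="replace")
        if a != b:
            diffs.append(value)
    return diffs

in_850 = b"\xd5".decode("cp850")  # dotless i in code page 850
in_858 = b"\xd5".decode("cp858")  # euro sign in code page 858
soh = b"\x01".decode("cp858")     # control character, as in ASCII

print(hex(ord(in_850)), hex(ord(in_858)), hex(ord(soh)))
print([hex(v) for v in differing_bytes("cp850", "cp858")])
```

Python's codec also treats /x01 as START OF HEADING rather than a smiling face, which is the same ASCII-compatible choice made for the charmap.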
[BZ #21084]
* benchtests/strcoll-inputs/filelist#en_US.UTF-8: Add IBM858 and ibm858.c.
* iconvdata/Makefile: Add IBM858.
* iconvdata/gconv-modules: Add IBM858.
* iconvdata/ibm858.c: New file.
* iconvdata/tst-tables.sh: Add IBM858.
* localedata/charmaps/IBM858: New file.
“Bengali” still remained in some comments in the bn_BD locale file,
in iso-639.def and in a test input file. Change it there as well.
“Bangla” is now used as the English name for this language in CLDR.
[BZ #14925]
* libio/tst-widetext.input: Change “Bengali” to “Bangla”.
* locale/iso-639.def: Change “Bengali” to “Bangla”.
* localedata/locales/bn_BD: “Bengali” was still used in some
comments. Change it to “Bangla”.
[BZ #22070]
* localedata/unicode-gen/utf8_gen.py: Set the width for
characters with Prepended_Concatenation_Mark property to 1
* localedata/charmaps/UTF-8: Updated using the improved script.
Writing ranges of neighbouring characters with the same width like this
<U000E0100>...<U000E01EF> 0
in charmaps/UTF-8 is more efficient than writing many single character lines
like:
<U000E0100> 0
<U000E0101> 0
...
[BZ #21750]
* unicode-gen/utf8_gen.py: Write all ranges of neighbouring characters
with the same width using the range notation in charmaps/UTF-8.
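The range compression described above can be sketched in Python. This is not the code in utf8_gen.py: the helper names `ucs_symbol`, `format_run`, and `width_lines` and the dictionary input are assumptions for illustration. Only the output notation follows the example given above.

```python
def ucs_symbol(code_point):
    """Format a code point as <UXXXX> or <UXXXXXXXX>."""
    if code_point < 0x10000:
        return "<U{:04X}>".format(code_point)
    return "<U{:08X}>".format(code_point)

def format_run(start, end, width):
    """Format one run: a single character or a start...end range."""
    if start == end:
        return "{} {}".format(ucs_symbol(start), width)
    return "{}...{} {}".format(ucs_symbol(start), ucs_symbol(end), width)

def width_lines(widths):
    """Merge neighbouring code points with equal width into range lines.

    widths maps code point -> width.
    """
    lines = []
    run_start = run_end = run_width = None
    for cp in sorted(widths):
        w = widths[cp]
        if run_start is not None and cp == run_end + 1 and w == run_width:
            run_end = cp  # extend the current run
            continue
        if run_start is not None:
            lines.append(format_run(run_start, run_end, run_width))
        run_start = run_end = cp
        run_width = w
    if run_start is not None:
        lines.append(format_run(run_start, run_end, run_width))
    return lines

sample = {cp: 0 for cp in range(0xE0100, 0xE01F0)}
sample[0x00AD] = 1
print("\n".join(width_lines(sample)))
```

For the sample input, 240 single-character lines collapse into one range line, plus one line for the isolated character.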
[BZ #15332]
* locales/es_CU (LC_MONETARY): use “,” for mon_decimal_point
and “.” for mon_thousands_sep (to agree with CLDR)
* locales/es_CU (LC_NUMERIC): Likewise.
[BZ #22038]
* locales/so_DJ (LC_TIME): Fix abday, abmon and
make t_fmt in the comment agree with the value of t_fmt.
* locales/so_ET (LC_TIME): Fix abday (From Axa to Axd)
* locales/so_KE (LC_TIME): Fix abday (From Axa to Axd)
* locales/so_SO (LC_TIME): Fix abday (From Axa to Axd)
[BZ #13805]
* locales/ru_RU (LC_MONETARY): Use “,” for mon_decimal_point
(to agree with CLDR).
* locales/ru_RU (LC_NUMERIC): Write mon_decimal_point in ASCII
for readability.
* locales/os_RU (LC_MONETARY): Copy from ru_RU,
makes it agree with CLDR.
Add locale for “Morisyen” which is also called “Mauritian Creole”
and is spoken in Mauritius.
[BZ #21971]
* localedata/SUPPORTED: Add mfe_MU/UTF-8.
* localedata/locales/mfe_MU: New File.
[BZ #21971]
* locale/iso-639.def: add Morisyen.
[BZ #21750]
* unicode-gen/utf8_gen.py (U+00AD): Set width to 1.
* unicode-gen/utf8_gen.py (U+1160..U+11FF): Set width to 0.
* unicode-gen/utf8_gen.py (U+3248..U+324F): Set width to 2.
* unicode-gen/utf8_gen.py (U+4DC0..U+4DFF): Likewise.
[BZ #19852]
[BZ #21750]
* unicode-gen/utf8_gen.py: Process EastAsianWidth lines before
UnicodeData lines so the latter have precedence; remove hack
to group output by EastAsianWidth ranges.
[BZ #14925]
* locales/bn_BD (LC_IDENTIFICATION): Change language name in
“title” and “language” from Bengali to Bangla.
* locales/bn_IN (LC_IDENTIFICATION): Likewise.
The custom stuff which was in LC_CTYPE of the km_KH locale seems
to be a very incomplete subset of what one gets by using
“copy "i18n"”. I cannot find anything special there which is not
in “copy "i18n"”, only lots of stuff which is missing.
[BZ #20008]
* locales/km_KH (LC_CTYPE): Use “copy "i18n"”.
[BZ #20482]
* locales/de_AT (LC_TIME): Use 2 letter abbreviations in abday.
* locales/de_BE (LC_TIME): Use 2 letter abbreviations in abday.
* locales/de_CH (LC_TIME): Use 2 letter abbreviations in abday.
* locales/de_DE (LC_TIME): Use readable ASCII in abday.
* locales/de_IT (LC_TIME): Use readable ASCII in abday.
* locales/de_LU (LC_TIME): Use 2 letter abbreviations in abday.
See also [BZ #20756].
U+202F NARROW NO-BREAK SPACE: a narrow form of a no-break space,
typically the width of a thin space or a mid space.
U+2009 THIN SPACE.
Many languages use a small gap as the thousands separator.
The thousands separator should not be a plain space, but a narrow space.
In addition, a line must not be wrapped in the middle of a
number.
Locale data was created in the era of 8-bit encodings, so most
locales use a space (incorrect: it allows wrapping the line in the middle
of the number) or NBSP (better, but typographically incorrect: the space
between groups is too wide).
Now that Unicode is widely supported, we should replace the legacy characters
with the correct Unicode character.
Unicode has a dedicated character for this purpose:
NNBSP
U+202F NARROW NO-BREAK SPACE: a narrow form of a no-break space,
typically the width of a thin space or a mid space
NNBSP has existed since Unicode 3.0.
Using NNBSP prevents line wrapping in the middle of a number and
improves the readability of numbers.
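As a rough illustration of the resulting output, here is a minimal Python sketch that groups digits with U+202F. It does not read glibc locale data: the function name `group_thousands` is hypothetical, and it handles only integers with groups of three digits.

```python
NNBSP = "\u202f"  # NARROW NO-BREAK SPACE

def group_thousands(number, separator=NNBSP):
    """Format an integer with the given thousands separator."""
    return "{:,}".format(number).replace(",", separator)

print(group_thousands(1234567))
print(repr(group_thousands(1234567)))
```

Since the separator is a no-break character, text layout should keep the whole number on one line.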
[BZ #20756]
* locales/aa_DJ (LC_MONETARY): Replace space by NNBSP as thousands separator.
* locales/az_AZ (LC_MONETARY): Likewise.
* locales/be_BY (LC_MONETARY): Likewise.
* locales/be_BY@latin (LC_MONETARY): Likewise.
* locales/bg_BG (LC_MONETARY): Likewise.
* locales/bs_BA (LC_MONETARY): Likewise.
* locales/ce_RU (LC_MONETARY): Likewise.
* locales/crh_UA (LC_MONETARY): Likewise.
* locales/cs_CZ (LC_MONETARY): Likewise.
* locales/cs_CZ (LC_NUMERIC): Likewise.
* locales/cv_RU (LC_MONETARY): Likewise.
* locales/de_AT (LC_MONETARY): Likewise.
* locales/eo (LC_MONETARY): Likewise.
* locales/es_CR (LC_MONETARY): Likewise.
* locales/es_CR (LC_NUMERIC): Likewise.
* locales/es_CU (LC_MONETARY): Likewise.
* locales/et_EE (LC_MONETARY): Likewise.
* locales/et_EE (LC_NUMERIC): Likewise.
* locales/fi_FI (LC_MONETARY): Likewise.
* locales/fi_FI (LC_NUMERIC): Likewise.
* locales/fr_CA (LC_MONETARY): Likewise.
* locales/fr_FR (LC_MONETARY): Likewise.
* locales/fr_FR (LC_NUMERIC): Likewise.
* locales/fr_LU (LC_MONETARY): Likewise.
* locales/fr_LU (LC_NUMERIC): Likewise.
* locales/hr_HR (LC_MONETARY): Likewise.
* locales/ht_HT (LC_NUMERIC): Likewise.
* locales/kk_KZ (LC_MONETARY): Likewise.
* locales/kk_KZ (LC_NUMERIC): Likewise.
* locales/ky_KG (LC_MONETARY): Likewise.
* locales/ky_KG (LC_NUMERIC): Likewise.
* locales/lv_LV (LC_MONETARY): Likewise.
* locales/lv_LV (LC_NUMERIC): Likewise.
* locales/mg_MG (LC_MONETARY): Likewise.
* locales/mhr_RU (LC_MONETARY): Likewise.
* locales/mk_MK (LC_MONETARY): Likewise.
* locales/mk_MK (LC_NUMERIC): Likewise.
* locales/mn_MN (LC_MONETARY): Likewise.
* locales/nb_NO (LC_MONETARY): Likewise.
* locales/nb_NO (LC_NUMERIC): Likewise.
* locales/nl_AW (LC_MONETARY): Likewise.
* locales/nl_NL (LC_MONETARY): Likewise.
* locales/nn_NO (LC_MONETARY): Likewise.
* locales/os_RU (LC_MONETARY): Likewise.
* locales/pap_AW (LC_MONETARY): Likewise.
* locales/pap_CW (LC_MONETARY): Likewise.
* locales/ru_RU (LC_MONETARY): Likewise.
* locales/ru_RU (LC_NUMERIC): Likewise.
* locales/ru_UA (LC_MONETARY): Likewise.
* locales/sk_SK (LC_MONETARY): Likewise.
* locales/sk_SK (LC_NUMERIC): Likewise.
* locales/sl_SI (LC_MONETARY): Likewise.
* locales/sl_SI (LC_NUMERIC): Likewise.
* locales/sq_MK (LC_MONETARY): Likewise.
* locales/sv_SE (LC_MONETARY): Likewise.
* locales/sv_SE (LC_NUMERIC): Likewise.
* locales/tg_TJ (LC_MONETARY): Likewise.
* locales/tt_RU (LC_MONETARY): Likewise.
* locales/tt_RU@iqtelif (LC_MONETARY): Likewise.
* locales/uk_UA (LC_MONETARY): Likewise.
* locales/uk_UA (LC_NUMERIC): Likewise.
* locales/unm_US (LC_MONETARY): Likewise.
* locales/unm_US (LC_NUMERIC): Likewise.
* locales/wo_SN (LC_MONETARY): Likewise.
[BZ #17563]
[BZ #16905]
* locales/cmn_TW (LC_COLLATE): Use cns11643_stroke file for sorting.
* locales/cmn_TW (LC_TIME): Improve time and date formats.
* locales/cmn_TW (LC_MESSAGES): Add yesstr and nostr.
* locales/cns11643_stroke: New file, stroke count collation for
traditional Chinese.
These comments are useless and only confusing. The encodings used to
create binary locales from source locales are listed in the
localedata/SUPPORTED file. The source files themselves are ASCII or UTF-8
encoded where non-ASCII UTF-8 is currently only used in comments. If
all locale source files are UTF-8 anyway, there is no need to specify
that in a special comment.
A new locale is added for the Seychelles, which is a member of the African
Union. English is an official language of the Seychelles.
[BZ #21854]
* locales/en_SC: New file.
* localedata/SUPPORTED: Add en_SC/UTF-8.
For the locales doi_IN, kok_IN, and sat_IN, the words for
“yes” and “no” were apparently in yesexpr and noexpr.
Copy them from there to add yesstr and nostr.
Also make yesexpr and noexpr more readable by using
the POSIX portable character set.
* locales/doi_IN (LC_MESSAGES): Add yesstr and nostr.
* locales/kok_IN (LC_MESSAGES): Add yesstr and nostr.
* locales/sat_IN (LC_MESSAGES): Add yesstr and nostr.
This reverts commit 8f75515080
Revert “Fix yesexpr in en_DK locale”.
* locales/en_DK (LC_MESSAGES): Restore original yesexpr, noexpr,
yesstr, nostr. Convert them to ASCII and add a comment why
we want to have them like this.
And make the expressions more readable by using the POSIX portable character set
instead of Unicode code points.
* locales/agr_PE (LC_MESSAGES): drop .* from yesexpr and noexpr
* locales/az_IR (LC_MESSAGES): Improve yesexpr and noexpr.
* locales/az_IR (LC_ADDRESS): Fix typo in comment and
use the individual iso-639-3 code for South Azerbaijani
"azb" in lang_term.
* locales/az_IR (LC_NAME): Improve readability of name_fmt in source.
After the recent import of month names from CLDRv31 (bug 21217,
commit c853f14) an import of abbreviated month names is also needed
to make sure they match the full forms.
In case of kok_IN CLDR does not provide the abbreviated month names
explicitly but uses full month names in such cases so abmon section
has been copied from mon.
* localedata/locales/as_IN (abmon): Update from CLDR.
* localedata/locales/bn_BD (abmon): Likewise.
* localedata/locales/bn_IN (abmon): Likewise.
* localedata/locales/gu_IN (abmon): Likewise.
* localedata/locales/hi_IN (abmon): Likewise.
* localedata/locales/kn_IN (abmon): Likewise.
* localedata/locales/ml_IN (abmon): Likewise.
* localedata/locales/mr_IN (abmon): Likewise.
* localedata/locales/ne_NP (abmon): Likewise.
* localedata/locales/or_IN (abmon): Likewise.
* localedata/locales/pa_IN (abmon): Likewise.
* localedata/locales/ta_IN (abmon): Likewise.
* localedata/locales/te_IN (abmon): Likewise.
* localedata/locales/kok_IN (abmon): Likewise but copied from mon.
Maithili is an official language not only in India but in Nepal as well.
https://en.wikipedia.org/wiki/Maithili_language
Reference is taken from mai_IN.
[BZ #21835]
* localedata/locales/mai_NP: New file.
* localedata/SUPPORTED: Add mai_NP/UTF-8.
After the recent update of int_select the comment needed an update, too.
While at it, all comments in LC_TELEPHONE were moved above their
respective values because this looks better. Some minor typos were fixed.
[BZ #21783]
* localedata/locales/lg_UG (LC_TELEPHONE): Move all comments
above the values, correct some of them.
The commit to add the Fiji Hindi locale mentioned
Bug 21207 - ce_RU: update weekdays from CLDR
which was wrong, correct is:
Bug 21694 - Current Glibc Locale Does Not Support Tok-Pisin and Fiji Hindi Locale
During Hindi Locale review I found many fields are incorrect
[BZ #21729]
* locales/hi_IN (LC_NAME): Fix name_mr, name_mrs, name_miss, name_ms
Signed-off-by: Akhilesh Kumar <akhilesh.k@samsung.com>
During Locale verification I observed that
yesstr and nostr are missing for Xhosa language locale
for South Africa
[BZ #21724]
* locales/xh_ZA (LC_MESSAGES): add yesstr and nostr
Signed-off-by: Akhilesh Kumar <akhilesh.k@samsung.com>
During locale verification I observed
incorrect full weekday names for ks_IN@devanagari.
Reference is taken from
http://www.mkraina.com/PDF/3-Self-authored%20Works%20(English)/15.pdf
and a Kashmiri Devanagari travel book and other sources.
[BZ #21721]
* locales/ks_IN@devanagari: Full weekday name Fix.
Signed-off-by: Akhilesh Kumar <akhilesh.k@samsung.com>
After the recent import of month names from CLDRv31 (bug 21217,
commit c853f14) more imports are also needed, mostly abbreviated month
names.
This patch also updates May (full month name) in ps_AF which was
skipped in the previous patch.
Incidentally, this import fixes bug 17225 (ar_SY) and partially
bug 19066 (ar_SA).
CLDR currently has a bug in the full month name for October for ar_IQ, see
http://unicode.org/cldr/trac/ticket/10460
* localedata/locales/ar_DZ (abmon): Full import from CLDR, abmon
is no longer abbreviated.
* localedata/locales/ar_IQ (abmon): Likewise.
* localedata/locales/ar_MA (abmon): Likewise.
* localedata/locales/ar_TN (abmon): Likewise.
* localedata/locales/ps_AF (abmon): Likewise.
* localedata/locales/ug_CN (abmon): Likewise.
* localedata/locales/ar_SA (abmon): Likewise, partially
fixes bug 19066.
* localedata/locales/ks_IN (abmon): A copy of mon.
* localedata/locales/ur_IN (abmon): Oct reworded "اكتوبر" to
"اکتوبر" (same change as mon).
* localedata/locales/ur_PK (abmon): Same changes as mon applied.
* localedata/locales/ps_AF (mon): May reworded "می" to "مۍ".
[BZ #17225]
* localedata/locales/ar_SY (abmon): May reworded "نوار" to
"أيار", this closes bug 17225.
* localedata/locales/ar_JO (abmon): Likewise.
* localedata/locales/ar_LB (abmon): Likewise.
[BZ #21711]
During locale verification I observed that
yesstr and nostr are missing for the Pashto [LC_MESSAGES] locale
for Afghanistan. References: Google Translate and a Pashto travel book.
[BZ #21706]
During locale verification I observed that
yesstr and nostr are missing for the Breton [LC_MESSAGES] locale.
Signed-off-by: Akhilesh Kumar <akhilesh.k@samsung.com>
After the recent import of month names from CLDR (bug 21217) more
imports are also needed, mostly abbreviated month names.
* localedata/locales/br_FR (abmon): Reworded "Eve " to "Mezh".
* localedata/locales/fy_NL (abmon): Reworded "Maa" (March) to
"Mrt" and "Maa" (May) to "Mai".
* localedata/locales/lg_UG (abmon): Reworded "Jun" to "Juu".
* localedata/locales/ln_CD (abmon): "yan", "fbl", "msi",
and so on.
* localedata/locales/mn_MN (abmon): "1-р сар", "2-р сар",
"3-р сар", and so on.
* localedata/locales/vi_VN (abmon): Reworded "Th01" to "Thg 1",
"Th02" to "Thg 2" and so on.
* localedata/locales/yo_NG (abday): "Àìkú", "Ajé", "Ìsẹ́gun",
and so on, also comment updated to match the new content.
(day): "Ọjọ́ Àìkú", "Ọjọ́ Ajé", "Ọjọ́ Ìsẹ́gun", and so on.
(abmon): "Ṣẹ́rẹ́", "Èrèlè", "Ẹrẹ̀nà", and so on.
(mon): Comment updated to match the actual content.
(d_t_fmt): Changed "%A" to "%a" and "%B" to "%b".
* localedata/locales/zu_ZA (abmon): "Jan", "Feb", "Mas",
and so on, also comment updated to match the new content.
(mon): comment updated to match the actual content.
This updates a bunch of locales based on CLDR v29 data:
az_AZ: changing Azərbaycanca to azərbaycan dili
be_BY: changing беларуская мова to беларуская
bem_ZM: changing iciBemba to Ichibemba
bg_BG: changing български език to български
bo_CN: changing པོད་སྐད་ to བོད་སྐད་
bo_IN: changing པོད་སྐད་ to བོད་སྐད་
br_FR: changing Brezhoneg to brezhoneg
brx_IN: lang_name: setting to बड़ो
ce_RU: changing нохчийн мотт to нохчийн
cs_CZ: changing Čeština to čeština
dz_BT: changing (རྫོང་ཁ to རྫོང་ཁ
el_CY: changing ελληνικά to Ελληνικά
el_GR: changing ελληνικά to Ελληνικά
es_AR: changing Español to español
es_BO: changing Español to español
es_CL: changing Español to español
es_CO: changing Español to español
es_CR: changing Español to español
es_CU: changing Español to español
es_DO: changing Español to español
es_EC: changing Español to español
es_ES: changing Español to español
es_GT: changing Español to español
es_HN: changing Español to español
es_MX: changing Español to español
es_NI: changing Español to español
es_PA: changing Español to español
es_PE: changing Español to español
es_PR: changing Español to español
es_PY: changing Español to español
es_SV: changing Español to español
es_US: changing Español to español
es_UY: changing Español to español
es_VE: changing Español to español
et_EE: changing eesti keel to eesti
eu_ES: changing Euskara to euskara
fr_BE: changing Français to français
fr_CA: changing Français to français
fr_CH: changing Français to français
fr_FR: changing Français to français
fr_LU: changing Français to français
fur_IT: changing Furlan to furlan
fy_NL: changing Frysk to West-Frysk
gl_ES: changing Galego to galego
gv_GB: changing y Ghaelg to Gaelg
he_IL: lang_name: setting to עברית
hsb_DE: changing Hornjoserbšćina to hornjoserbšćina
hy_AM: changing Հայերեն to հայերեն
id_ID: changing Bahasa Indonesia to Bahasa Indonesia
it_CH: changing Italiano to italiano
it_IT: changing Italiano to italiano
kl_GL: changing Kalaallisut to kalaallisut
km_KH: changing ភាសាខ្មែរ to ខ្មែរ
ko_KR: changing 한국말 to 한국어
ks_IN: changing kạ̄šur to کٲشُر
kw_GB: changing Kernowek to kernewek
ky_KG: changing Кыргызча to кыргызча
lg_UG: changing Oluganda to Luganda
lt_LT: changing lietuvių kalba to lietuvių
lv_LV: changing latviešu valoda to latviešu
mk_MK: changing македонски јазик to македонски
mn_MN: changing Монгол хэл to монгол
nb_NO: changing Bokmål to norsk bokmål
nn_NO: changing Nynorsk to nynorsk
os_RU: lang_name: setting to ирон
ru_RU: lang_name: setting to русский
ru_UA: lang_name: setting to русский
se_NO: changing Davvisámegiella to davvisámegiella
sk_SK: lang_name: setting to slovenčina
ta_IN: lang_name: setting to தமிழ்
ta_LK: lang_name: setting to தமிழ்
tk_TM: changing Türkmençe to türkmençe
tr_CY: changing Turkish to Türkçe
tr_TR: changing Turkish to Türkçe
ur_IN: lang_name: setting to {اردو}
ur_PK: lang_name: setting to {اردو}
vi_VN: changing Việt ngữ to Tiếng Việt
yo_NG: changing Yorùbá to Èdè Yorùbá
zu_ZA: changing IsiZulu to isiZulu
Most of these are simple case changes, but they match the CLDR db.
A search for a few of the others suggests they're also correct.
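
One rough way to separate the pure case changes in the list above from the
real renames is to compare case-folded strings. This is only a sketch: the
pairs are copied from the list, and the helper name is illustrative, not
part of glibc or CLDR tooling.

```python
# (old lang_name, new lang_name) pairs copied from the list above.
PAIRS = {
    "cs_CZ": ("Čeština", "čeština"),
    "es_ES": ("Español", "español"),
    "fr_FR": ("Français", "français"),
    "et_EE": ("eesti keel", "eesti"),
    "tr_TR": ("Turkish", "Türkçe"),
}


def is_case_only_change(old: str, new: str) -> bool:
    """Return True when old and new differ only in letter case."""
    return old != new and old.casefold() == new.casefold()


# Locales whose lang_name changed only in case.
case_only = sorted(
    loc for loc, (old, new) in PAIRS.items() if is_case_only_change(old, new)
)
# Locales whose lang_name was actually reworded.
renamed = sorted(loc for loc in PAIRS if loc not in case_only)
```

With these sample pairs, cs_CZ, es_ES and fr_FR fall into the case-only
group, while et_EE and tr_TR are real renames.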
[BZ #21217]
* localedata/locales/ln_CD (mon): Months imported from CLDR, all changed:
"Yanwáli" to "sánzá ya yambo", "Febwáli" to "sánzá ya míbalé" and so on.
* locales/mn_MN (mon): reworded "Хулгана сарын" to "Нэгдүгээр сар",
"Үхэр сарын" to "Хоёрдугаар сар", and so on.
* locales/vi_VN (mon): reworded "Tháng một" to "Tháng 1", "Tháng hai"
to "Tháng 2", "Tháng ba" to "Tháng 3", and so on.
* locales/yo_NG (mon): reworded "Jánúárì" to "Oṣù Ṣẹ́rẹ́", "Fẹ́búárì"
to "Oṣù Èrèlè", "Máàṣì" to "Oṣù Ẹrẹ̀nà", and so on.
* locales/zu_ZA (mon): reworded "uMasingana" to "Januwari",
"uNhlolanja" to "Februwari", "uNdasa" to "Mashi", and so on.
[BZ #21217]
* localedata/locales/be_BY (mon, abmon): Reworded "Травень" ("travyen'")
and abbreviated "Тра" ("tra") to "Май" ("may").
* localedata/locales/be_BY@latin (mon, abmon): Likewise, "Travień"
and abbreviated "Tra" reworded to "Maj".
* localedata/locales/br_FR (day, abmon, mon, d_t_fmt): Use the proper
Unicode apostrophe (Cʼhwevrer, mercʼher, Dʼar).
* localedata/locales/es_PE (mon, abmon): Reworded "septiembre" to
"setiembre" and abbreviated "sep" to "set".
* localedata/locales/es_UY: Likewise.
* localedata/locales/fil_PH (mon): Reworded "Septiyembre" to "Setyembre"
and "Nobiyembre" to "Nobyembre".
* localedata/locales/fur_IT (mon, abmon): Reworded "Decembar" to
"Dicembar" and abbreviated "Dec" to "Dic".
* localedata/locales/fy_NL (mon): Reworded "Janaris" to "Jannewaris".
* localedata/locales/ha_NG (mon, abmon): Reworded "Fabrairu" to
"Faburairu" and "Afrilu" to "Afirilu", also abbreviated "Afr" to "Afi".
* localedata/locales/ig_NG (mon, abmon): All months begin with
uppercase. Reworded: "febụrụwarị" to "Febrụwarị", "epreel" to "Eprel",
"ọgọstụ" to "Ọgọọst", "nọvemba" to "Novemba", "nọv" to "Nov".
* localedata/locales/kw_GB (mon): Months imported from CLDR, many
changes: all "Mys" to "mis", "Mys Whevrel" to "mis Hwevrer", "Mys Merth"
to "mis Meurth", "Mys Evan" to "mis Metheven", and so on.
(abmon): "Whe" to "Hwe", "Mer" to "Meu", "Evn" to "Met".
* localedata/locales/lg_UG (mon): Reworded "Julaai" to "Julaayi".
* localedata/locales/ln_CD (mon): Months imported from CLDR, all changed:
"Yanwáli" to "sánzá ya yambo", "Febwáli" to "sánzá ya míbalé" and so on.
* localedata/locales/mg_MG (mon, abmon): All months now begin with
uppercase.
* localedata/locales/se_NO (mon): Months imported from CLDR, all
suffixes changed from "mánu" to "mánnu".
* localedata/locales/sr_RS@latin (mon): Reworded "juni" to "jun" and
"juli" to "jul".
* localedata/locales/uz_UZ (mon): Reworded "Sentyabr" to "Sentabr"
and "Oktyabr" to "Oktabr".
* localedata/locales/uz_UZ@cyrillic (mon): Most of the names reworded,
no longer end with a soft sign.
* localedata/locales/wae_CH (mon): reworded "Bráchet" to "Bráčet",
"Öigschte" to "Öigšte", "Herbschtmánet" to "Herbštmánet",
"Chrischtmánet" to "Chrištmánet".
* Unicode 10.0.0 Support: Character encoding, character type info, and
transliteration tables are all updated to Unicode 10.0.0, using
generator scripts contributed by Mike FABIAN (Red Hat).
<locale.h> is specified to define locale_t in POSIX.1-2008, and so are
all of the headers that define functions that take locale_t arguments.
Under _GNU_SOURCE, the additional headers that define such functions
have also always defined locale_t. Therefore, there is no need to use
__locale_t in public function prototypes, nor in any internal code.
* ctype/ctype-c99_l.c, ctype/ctype.h, ctype/ctype_l.c
* include/monetary.h, include/stdlib.h, include/time.h
* include/wchar.h, locale/duplocale.c, locale/freelocale.c
* locale/global-locale.c, locale/langinfo.h, locale/locale.h
* locale/localeinfo.h, locale/newlocale.c
* locale/nl_langinfo_l.c, locale/uselocale.c
* localedata/bug-usesetlocale.c, localedata/tst-xlocale2.c
* stdio-common/vfscanf.c, stdlib/monetary.h, stdlib/stdlib.h
* stdlib/strfmon_l.c, stdlib/strtod_l.c, stdlib/strtof_l.c
* stdlib/strtol.c, stdlib/strtol_l.c, stdlib/strtold_l.c
* stdlib/strtoll_l.c, stdlib/strtoul_l.c, stdlib/strtoull_l.c
* string/strcasecmp.c, string/strcoll_l.c, string/string.h
* string/strings.h, string/strncase.c, string/strxfrm_l.c
* sysdeps/ieee754/float128/strtof128_l.c
* sysdeps/ieee754/float128/wcstof128.c
* sysdeps/ieee754/float128/wcstof128_l.c
* sysdeps/ieee754/ldbl-128ibm/strtold_l.c
* sysdeps/ieee754/ldbl-64-128/strtold_l.c
* sysdeps/ieee754/ldbl-opt/nldbl-compat.c
* sysdeps/ieee754/ldbl-opt/nldbl-strfmon_l.c
* sysdeps/ieee754/ldbl-opt/nldbl-strtold_l.c
* sysdeps/ieee754/ldbl-opt/nldbl-wcstold_l.c
* sysdeps/powerpc/powerpc32/power7/strcasecmp.S
* sysdeps/powerpc/powerpc64/power7/strcasecmp.S
* sysdeps/x86_64/strcasecmp_l-nonascii.c
* sysdeps/x86_64/strncase_l-nonascii.c, time/strftime_l.c
* time/strptime_l.c, time/time.h, wcsmbs/mbsrtowcs_l.c
* wcsmbs/wchar.h, wcsmbs/wcscasecmp.c, wcsmbs/wcsncase.c
* wcsmbs/wcstod.c, wcsmbs/wcstod_l.c, wcsmbs/wcstof.c
* wcsmbs/wcstof_l.c, wcsmbs/wcstol_l.c, wcsmbs/wcstold.c
* wcsmbs/wcstold_l.c, wcsmbs/wcstoll_l.c, wcsmbs/wcstoul_l.c
* wcsmbs/wcstoull_l.c, wctype/iswctype_l.c
* wctype/towctrans_l.c, wctype/wcfuncs_l.c
* wctype/wctrans_l.c, wctype/wctype.h, wctype/wctype_l.c:
Change all uses of __locale_t to locale_t.
[BZ #21207]
* locales/ce_RU (day): Updated (imported) from CLDR. Uppercase letters
left unchanged.
* locales/ce_RU (abday): Minor updates to match (day): Latin uppercase
"I" replaced with Cyrillic "Ӏ" ("Palochka", Unicode: U+04C0). Trailing
spaces removed.
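
The Latin "I" and the Cyrillic palochka look nearly identical in most
fonts, so the difference is easiest to see from the code points. The short
sketch below uses only Python's standard unicodedata module; the helper
name is illustrative.

```python
import unicodedata

LATIN_I = "I"        # U+0049, what the old abday strings contained
PALOCHKA = "\u04c0"  # U+04C0, what the ChangeLog entry says is used now


def describe(ch: str) -> str:
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for ch in (LATIN_I, PALOCHKA):
    print(describe(ch))
```

The two characters compare unequal, so strings that mix them do not match
the same text written with the correct letter.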
Although el_GR uses ISO-8859-7:2003, which contains the euro symbol, it
is not possible to know this a priori when selecting the el_GR locale.
You therefore cannot tell whether el_GR includes the 2003 amendments,
which add the euro symbol. This is resolved by creating an el_GR@euro
locale similar to all the other @euro locales for non-UTF8
charsets.
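
Whether a given 8-bit charset can represent the euro sign can be probed by
trying to encode it. The sketch below uses Python's own codecs, not
glibc's converters; it assumes Python's iso8859_7 codec follows the 2003
revision, and the helper name is illustrative.

```python
EURO = "\u20ac"  # EURO SIGN


def can_encode(text: str, charset: str) -> bool:
    """Return True if text is representable in the named charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


for charset in ("latin-1", "iso8859_15", "iso8859_7"):
    print(charset, can_encode(EURO, charset))
```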
Fix the incorrect sorting order of a digraph and its geminated variant,
a regression introduced by a faulty fix to bug 13547 in commit
b008d4c856.
Fix two inconsistencies in sorting unusual capitalization of digraphs
(bug #18587).
Enable DIACRIT_FORWARD to work around bug #17750.
Sort foreign accents after the Hungarian ones.
Add extensive unittests containing all the examples from The Rules of
Hungarian Orthography and many more, including explanatory comments.
* Unicode 9.0.0 Support: Character encoding, character type info, and
transliteration tables are all updated to Unicode 9.0.0, using
generator scripts contributed by Mike FABIAN (Red Hat).
Both regexes end with a "*." which means the previous match can be
omitted, and then the . allows them to match any input at all.
This means tools like coreutils' `rm -i` will always delete things
when prompted because the yesexpr regex matches all inputs (even
the negative ones).
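
The effect of a trailing "*." can be reproduced with a small sketch. The
patterns below are hypothetical stand-ins, not the affected locale's actual
yesexpr, and Python's re module is used in place of the POSIX regex
functions; for patterns this simple the two behave the same way.

```python
import re

# Hypothetical patterns for illustration only.
BROKEN_YESEXPR = r"^[yY]*."  # "*" makes the class optional, "." takes any char
FIXED_YESEXPR = r"^[yY].*"   # one required y/Y, then anything


def says_yes(pattern: str, answer: str) -> bool:
    """Return True if the answer matches the given yes-expression."""
    return re.search(pattern, answer) is not None


for answer in ("yes", "no"):
    print(answer, says_yes(BROKEN_YESEXPR, answer), says_yes(FIXED_YESEXPR, answer))
```

With the broken pattern, "no" is accepted as an affirmative answer because
the character class matches zero times and "." consumes the "n".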
Microsoft long ago added a mapping for 0x80 to the Euro sign to their
CP936. While GBK 1.0 doesn't include this mapping, it is compatible,
and Microsoft and glibc alias the two codepages. We could split them
apart so GBK wouldn't include the mapping, but that seems like a lot
of work for little gain.
The standard currently in effect (LST ISO 8601:1997) mandates hyphens
in date formats, as opposed to the full stops the locale currently uses.
The change also matches current CLDR data (v29), Wikipedia's & Wikia's
settings, and Microsoft's Lithuanian Style Guide.
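
The difference between the two separators is easy to show with strftime.
The format strings below are assumptions chosen to illustrate full stops
versus hyphens; they are not copied from the locale source.

```python
from datetime import date

OLD_D_FMT = "%Y.%m.%d"  # assumed old style, with full stops
NEW_D_FMT = "%Y-%m-%d"  # assumed new style, with hyphens


def format_date(d: date, fmt: str) -> str:
    """Format a date using a strftime-style format string."""
    return d.strftime(fmt)


sample = date(2016, 4, 20)
print(format_date(sample, OLD_D_FMT))
print(format_date(sample, NEW_D_FMT))
```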
According to "Requirements of information technology in Estonian
language and cultural environment", the monetary symbol should be
written after the amount:
https://www.evs.ee/products/evs-8-2008