From localedef --help:
Output control:
...
--no-warnings=<warnings> Comma-separated list of warnings to disable;
supported warnings are: ascii, intcurrsym
...
--warnings=<warnings> Comma-separated list of warnings to enable;
supported warnings are: ascii, intcurrsym
Locales using SHIFT_JIS and SHIFT_JISX0213 character maps are not ASCII
compatible. In order to build locales using these character maps, and
have localedef exit with a status of 0, we add new options to localedef
to disable or enable specific warnings. The options are --no-warnings
and --warnings, to disable and enable specific warnings respectively.
The options take a comma-separated list of warning names. The warning
names are taken directly from the generated warning. When a warning
that can be disabled is issued it will print something like this: foo is
not defined [--no-warnings=foo]
For the initial implementation we add two controllable warnings; first
'ascii' which is used by the localedata installation makefile target to
install SHIFT_JIS and SHIFT_JISX0213-using locales without error; second
'intcurrsym' which allows a program to use a non-standard international
currency symbol without triggering a warning. The 'intcurrsym' control
will be useful in the future if currency codes are added that are not in
our current ISO 4217 list, and the user wants to avoid the warning. Having
at least two warnings to control gives an example for how the changes
can be extended to more warnings if required in the future.
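The warning controls described above can be modelled roughly as follows. This is a minimal Python sketch, not the C code in localedef: the names set_warnings and format_warning, the default of both warnings being enabled, and the rejection of unknown warning names are all assumptions made for illustration.

```python
# Hypothetical model of the --warnings / --no-warnings handling.
# Assumption: both controllable warnings start out enabled.
DEFAULT_WARNINGS = {"ascii": True, "intcurrsym": True}


def set_warnings(state, names, enabled):
    """Apply a comma-separated list of warning names to `state`."""
    for name in names.split(","):
        if name not in state:
            # Assumption: unknown names are rejected; the real tool's
            # behaviour for unknown names is not described here.
            raise ValueError("unknown warning: " + name)
        state[name] = enabled


def format_warning(state, name, message):
    """Return the text to print, or None if the warning is disabled."""
    if not state[name]:
        return None
    # A controllable warning names the option that would silence it.
    return "{} [--no-warnings={}]".format(message, name)


state = dict(DEFAULT_WARNINGS)
set_warnings(state, "ascii", False)  # models --no-warnings=ascii
print(format_warning(state, "ascii", "not ASCII compatible"))
print(format_warning(state, "intcurrsym", "foo is not defined"))
```

Keeping one boolean per warning name is what makes it cheap to add further controllable warnings later.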
These changes allow ja_JP.SHIFT_JIS and ja_JP.SHIFT_JISX0213 to be
compiled without warnings using --no-warnings=ascii. The
localedata/Makefile $(INSTALL-SUPPORTED-LOCALES) target is adjusted to
automatically add `--no-warnings=ascii` for such charmaps, and likewise
localedata/gen-locale.sh is adjusted with similar logic.
v2: Bring verbose, be_quiet, and all warning control booleans into
record-status.c, and compile this object file to be used by locale,
iconv, and localedef. Any users include record-status.h.
v3: Fix an instance of boolean coercion in set_warning().
Signed-off-by: Carlos O'Donell <carlos@redhat.com>
The localedata collation test data is encoded in a particular
character set. We rename the test data to match the full locale
name with encoding, and adjust the Makefile and sort-test.sh
script. This allows us to have a future C.UTF-8 test that is
disambiguated from the built-in C locale.
Signed-off-by: Carlos O'Donell <carlos@redhat.com>
After the transition to generating a distinct file for Unicode ctype
information e.g. i18n_ctype, the check target was left with the wrong
target name. This patch fixes the check target and regenerates the
files with more information than previously used, filling in the
LC_IDENTIFICATION data.
Tested on x86_64 by regenerating from Unicode source files, and
running checks. Tested by subsequently rebuilding all locales.
No regressions in testsuite.
Signed-off-by: Carlos O'Donell <carlos@redhat.com>
Reported-by: Rafal Luzynski <digitalfreak@lingonborough.com>
* localedata/locales/hi_IN (LC_MESSAGES): In yesexpr and noexpr,
also check for the first characters of yesstr and nostr.
* localedata/locales/kn_IN (LC_MESSAGES): Likewise.
* localedata/locales/ks_IN@devanagari (LC_MESSAGES): Likewise.
* localedata/locales/chr_US (LC_MESSAGES): In yesexpr and noexpr,
match also for the contents of yesstr and nostr. As the first letter
of yesstr and nostr is equal, checking only for the first letter
is not enough.
* localedata/locales/ug_CN (LC_MESSAGES): Fix noexpr and yesexpr
by including the first letters of nostr and yesstr in the regexp.
Also make it more readable by using ASCII where possible.
* localedata/locales/te_IN (LC_MESSAGES): Fix noexpr by including
the first letter of nostr in the regexp. It agrees with CLDR now.
Also make it more readable by using ASCII where possible.
* localedata/locales/km_KH (LC_MESSAGES): Fix yesstr and nostr.
The yesstr and nostr apparently came from CLDR. And CLDR has a bug there:
these strings contain a U+17D6 (which somewhat looks like a colon)
instead of a real colon to separate the full words for “yes”
and “no” from the single letter responses.
* localedata/locales/ka_GE (LC_MESSAGES): Fix yesexpr to make
it agree with CLDR (include the first letter of yesstr).
Also make it more readable by using ASCII where possible.
* localedata/locales/mr_IN (LC_MESSAGES): Fix yesstr and nostr
and improve yesexpr and noexpr. The yesstr and nostr apparently
came from CLDR. And CLDR has a bug there: these strings contain
a U+0903 (which looks like a colon) instead of a real colon
to separate the full words for “yes” and “no” from the single
letter responses.
Using all characters of the full words for yes and no in yesexpr and noexpr
makes no sense here, especially not because the words for yes and no
share one character.
* localedata/locales/bn_BD (LC_MESSAGES): Use only the first
letters of the full yesstr and nostr in yesexpr and noexpr.
* localedata/locales/an_ES (LC_MESSAGES): Add yesstr and nostr.
* localedata/locales/an_ES (LC_ADDRESS): Add lang_term and lang_lib.
* localedata/locales/an_ES: Make source more readable by using ASCII
where possible.
* localedata/locales/tpi_PG (LC_MESSAGES): Fix yesexpr and noexpr
by adding the generic +1 and -0 as in all other locales.
* localedata/locales/tpi_PG (LC_TIME): Fix some typos in the month and
day names and make it more readable by using ASCII where possible.
[BZ #16777]
* localedata/locales/pl_PL (LC_MONETARY): Use U+202F as mon_thousands_sep
and improve readability by using more ASCII.
* localedata/locales/pl_PL (LC_NUMERIC): Use U+202F as thousands_sep
and improve readability by using more ASCII.
The Valencian (meridional Catalan) locale is basically a copy of the
Catalan locale. The point of having a separate locale is only for PO
translations. This locale is already provided by several distributions
and is already supported by various projects like LibreOffice, Mozilla,
Gnome, KDE.
Aurelien Jarno <aurelien@aurel32.net>
[BZ #2522]
* localedata/locales/ca_ES@valencia: New file.
* localedata/SUPPORTED: Add ca_ES@valencia/UTF-8.
CLDR uses this pattern as well.
[BZ #22019]
* localedata/locales/el_GR: Set n_cs_precedes to 0.
* localedata/locales/el_CY: copy "el_GR" because it is identical.
* stdlib/tst-strfmon_l.c: adapt test case.
The error and warning handling in localedef, locale, and iconv
is a bit of a mess.
We use ugly constructs like this:
WITH_CUR_LOCALE (error (1, errno, gettext ("\
cannot read character map directory `%s'"), directory));
to issue errors, and read error_message_count directly from the
error API to detect errors. The problem with that is that the
code also uses error to print warnings, and informative messages.
All of this leads to problems where just having warnings will
produce an exit status as if errors had been seen.
To fix this situation I have adopted the following high-level
changes:
* All errors are counted distinctly.
* All warnings are counted distinctly.
* All informative messages are not counted.
* Increasing verbosity cannot generate *more* errors. It
previously did for errors conditional on verbose; this
is now fixed.
* Increasing verbosity *can* generate *more* warnings.
* Making the output quiet cannot generate *fewer* errors.
It previously did for errors conditional on be_quiet;
this is now fixed.
* Each of error, warning, and informative message has its
own function to call defined in record-status.h, and they
are: record_error, record_warning, and record_verbose.
* The record_error function always records an error, but
conditional on be_quiet may not print it.
* The record_warning function always records a warning,
but conditional on be_quiet may not print it.
* The record_verbose function only prints the verbose
message if verbose is true and be_quiet is false.
This has allowed the following fix:
* Previously any warnings were being treated as errors
because they incremented error_message_count, but now
we properly return an exit status of 1 if there are
warnings but output was generated.
All of this allows localedef to correctly decide if errors,
or warnings were present, and produce the correct exit code.
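The counting rules above can be sketched as a small model. This is Python for illustration only, not the C code in record-status.h: the Status class and its exit_status method are hypothetical, and the value 4 for the error case is an assumption (this text only states that warnings with generated output give an exit status of 1).

```python
import sys


class Status:
    """Hypothetical model of the record_* counting rules."""

    def __init__(self, verbose=False, be_quiet=False):
        self.verbose = verbose
        self.be_quiet = be_quiet
        self.errors = 0
        self.warnings = 0

    def record_error(self, msg):
        # Always counted; printing depends on be_quiet only.
        self.errors += 1
        if not self.be_quiet:
            print("error: " + msg, file=sys.stderr)

    def record_warning(self, msg):
        # Always counted; printing depends on be_quiet only.
        self.warnings += 1
        if not self.be_quiet:
            print("warning: " + msg, file=sys.stderr)

    def record_verbose(self, msg):
        # Never counted; printed only if verbose and not quiet.
        if self.verbose and not self.be_quiet:
            print(msg, file=sys.stderr)

    def exit_status(self):
        if self.errors:
            return 4  # assumption: errors, no output generated
        if self.warnings:
            return 1  # warnings, but output was generated
        return 0


status = Status(be_quiet=True)
status.record_warning("non-standard currency symbol")
status.record_verbose("processing LC_MONETARY")
print(status.exit_status())
```

Because the counters are updated before the be_quiet check, neither quiet nor verbose mode can change the resulting exit status.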
The locale and iconv programs now also use record-status.h
and we have removed the WITH_CUR_LOCALE hack, and instead
have internal push_locale/pop_locale functions centralized
in the record routines.
Signed-off-by: Carlos O'Donell <carlos@redhat.com>
The commit does the following things:
* Move non-transliteration Unicode generated data to i18n_ctype.
* Copy the i18n_ctype data into i18n and add transliteration.
In the future, any locale which needs Unicode LC_CTYPE data can
also just use `copy i18n_ctype` and get the base character classes
and maps without transliteration.
Tested by compiling all the locales and my prototype C.UTF-8 which
uses it.
Signed-off-by: Carlos O'Donell <carlos@redhat.com>
This code page is identical to code page 850 except that X'D5'
has been changed from LI61 (dotless i) to SC20 (euro symbol).
The code points from /x01 to /x1f in the /localedata/charmaps/IBM858
file have the same mapping as those in localedata/charmaps/ANSI_X3.4-1968.
That means they disagree with
ftp://ftp.software.ibm.com/software/globalization/gcoc/attachments/CP00858.txt
in that range.
For example, localedata/charmaps/IBM858 and localedata/charmaps/ANSI_X3.4-1968 have:
“<U0001> /x01 START OF HEADING (SOH)”
whereas CP00858.txt has:
“01 SS000000 Smiling Face”
That means that CP00858.txt is not really ASCII-compatible and to make
it ASCII-compatible we deviate from CP00858.txt in the code points from /x01
to /x1f.
[BZ #21084]
* benchtests/strcoll-inputs/filelist#en_US.UTF-8: Add IBM858 and ibm858.c.
* iconvdata/Makefile: Add IBM858.
* iconvdata/gconv-modules: Add IBM858.
* iconvdata/ibm858.c: New file.
* iconvdata/tst-tables.sh: Add IBM858
* localedata/charmaps/IBM858: New file.
“Bengali” still remained in some comments in the bn_BD locale file,
in iso-639.def and in a test input file. Change it there as well.
“Bangla” is now used as the English name for this language in CLDR.
[BZ #14925]
* libio/tst-widetext.input: Change “Bengali” to “Bangla”.
* locale/iso-639.def: Change “Bengali” to “Bangla”.
* localedata/locales/bn_BD: “Bengali” was still used in some
comments. Change it to “Bangla”.
[BZ #22070]
* localedata/unicode-gen/utf8_gen.py: Set the width for
characters with Prepended_Concatenation_Mark property to 1
* localedata/charmaps/UTF-8: Updated using the improved script.
Writing ranges of neighbouring characters with the same width like this
<U000E0100>...<U000E01EF> 0
in charmaps/UTF-8 is more efficient than writing many single character lines
like:
<U000E0100> 0
<U000E0101> 0
...
[BZ #21750]
* unicode-gen/utf8_gen.py: Write all ranges of neighbouring characters
with the same width using the range notation in charmaps/UTF-8.
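The range notation can be produced by collapsing runs of consecutive code points that share a width. The following is a minimal Python sketch of that idea, not the code in utf8_gen.py: the helper names ucs_symbol, format_entry and compress_widths are chosen for illustration, and the 4-digit/8-digit symbol formatting is an assumption based on the lines quoted above.

```python
def ucs_symbol(code_point):
    """Format a code point like the charmap lines quoted above."""
    if code_point < 0x10000:
        return "<U{:04X}>".format(code_point)
    return "<U{:08X}>".format(code_point)


def format_entry(first, last, width):
    """One output line: a single symbol or a first...last range."""
    if first == last:
        return "{} {}".format(ucs_symbol(first), width)
    return "{}...{} {}".format(ucs_symbol(first), ucs_symbol(last), width)


def compress_widths(widths):
    """Collapse neighbouring code points with equal width into ranges.

    `widths` maps code point (int) to width (int).
    """
    lines = []
    start = prev = None
    for code_point in sorted(widths):
        same_run = (start is not None
                    and code_point == prev + 1
                    and widths[code_point] == widths[start])
        if same_run:
            prev = code_point
            continue
        if start is not None:
            lines.append(format_entry(start, prev, widths[start]))
        start = prev = code_point
    if start is not None:
        lines.append(format_entry(start, prev, widths[start]))
    return lines


# 240 variation selectors with width 0 collapse into one line.
selectors = {cp: 0 for cp in range(0xE0100, 0xE01F0)}
print(compress_widths(selectors))
```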
[BZ #15332]
* locales/es_CU (LC_MONETARY): use “,” for mon_decimal_point
and “.” for mon_thousands_sep (to agree with CLDR)
* locales/es_CU (LC_NUMERIC): Likewise.
[BZ #22038]
* locales/so_DJ (LC_TIME): Fix abday, abmon and
make t_fmt in the comment agree with the value of t_fmt.
* locales/so_ET (LC_TIME): Fix abday (From Axa to Axd)
* locales/so_KE (LC_TIME): Fix abday (From Axa to Axd)
* locales/so_SO (LC_TIME): Fix abday (From Axa to Axd)
[BZ #13805]
* locales/ru_RU (LC_MONETARY): Use “,” for mon_decimal_point
(to agree with CLDR).
* locales/ru_RU (LC_NUMERIC): Write mon_decimal_point in ASCII
for readability.
* locales/os_RU (LC_MONETARY): Copy from ru_RU,
makes it agree with CLDR.
Add locale for “Morisyen” which is also called “Mauritian Creole”
and is spoken in Mauritius.
[BZ #21971]
* localedata/SUPPORTED: Add mfe_MU/UTF-8.
* localedata/locales/mfe_MU: New File.
[BZ #21971]
* locale/iso-639.def: add Morisyen.
[BZ #21750]
* unicode-gen/utf8_gen.py (U+00AD): Set width to 1.
* unicode-gen/utf8_gen.py (U+1160..U+11FF): Set width to 0.
* unicode-gen/utf8_gen.py (U+3248..U+324F): Set width to 2.
* unicode-gen/utf8_gen.py (U+4DC0..U+4DFF): Likewise.
[BZ #19852]
[BZ #21750]
* unicode-gen/utf8_gen.py: Process EastAsianWidth lines before
UnicodeData lines so the latter have precedence; remove hack
to group output by EastAsianWidth ranges.
[BZ #14925]
* locales/bn_BD (LC_IDENTIFICATION): Change language name in
“title” and “language” from Bengali to Bangla.
* locales/bn_IN (LC_IDENTIFICATION): Likewise.
The custom stuff which was in LC_CTYPE of the km_KH locale seems
to be a very incomplete subset of what one gets by using
“copy "i18n"”. I cannot find anything special there which is not
in “copy "i18n"”, only lots of stuff which is missing.
[BZ #20008]
* locales/km_KH (LC_CTYPE): Use “copy "i18n"”.
[BZ #20482]
* locales/de_AT (LC_TIME): Use 2 letter abbreviations in abday.
* locales/de_BE (LC_TIME): Use 2 letter abbreviations in abday.
* locales/de_CH (LC_TIME): Use 2 letter abbreviations in abday.
* locales/de_DE (LC_TIME): Use readable ASCII in abday.
* locales/de_IT (LC_TIME): Use readable ASCII in abday.
* locales/de_LU (LC_TIME): Use 2 letter abbreviations in abday.
See also [BZ #20756].
U+202F NARROW NO-BREAK SPACE: a narrow form of a no-break space,
typically the width of a thin space or a mid space.
U+2009 THIN SPACE.
Many languages use a small gap as the thousands separator.
The thousands separator should not be a plain space, but a narrow space.
Additionally, a line must not be wrapped in the middle of a number.
Locale data was created in the age of 8-bit encodings, so most
locales use a space (incorrect: it allows wrapping the line in the middle
of the number), or NBSP (better, but typographically incorrect: the space
between groups is too wide).
Unicode is now widely supported, so we should drop the legacy characters
in favor of the correct Unicode character.
Unicode has a dedicated character for this purpose:
NNBSP
U+202F NARROW NO-BREAK SPACE: a narrow form of a no-break space,
typically the width of a thin space or a mid space
NNBSP has existed since Unicode 3.0.
Using NNBSP prevents line wrapping in the middle of a number and
improves the readability of numbers.
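As an illustration of the intended result, here is a minimal Python sketch that groups digits with U+202F. It does not use the locale machinery at all; group_thousands is a hypothetical helper and it handles integers only.

```python
import unicodedata

NNBSP = "\u202f"  # U+202F NARROW NO-BREAK SPACE


def group_thousands(value, sep=NNBSP):
    """Group an integer's digits in threes, separated by `sep`."""
    # Format with "," as a placeholder, then swap in the separator.
    return "{:,}".format(value).replace(",", sep)


print(unicodedata.name(NNBSP))
print(group_thousands(1234567))
```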
[BZ #20756]
* locales/aa_DJ (LC_MONETARY): Replace space by NNBSP as thousands separator.
* locales/az_AZ (LC_MONETARY): Likewise.
* locales/be_BY (LC_MONETARY): Likewise.
* locales/be_BY@latin (LC_MONETARY): Likewise.
* locales/bg_BG (LC_MONETARY): Likewise.
* locales/bs_BA (LC_MONETARY): Likewise.
* locales/ce_RU (LC_MONETARY): Likewise.
* locales/crh_UA (LC_MONETARY): Likewise.
* locales/cs_CZ (LC_MONETARY): Likewise.
* locales/cs_CZ (LC_NUMERIC): Likewise.
* locales/cv_RU (LC_MONETARY): Likewise.
* locales/de_AT (LC_MONETARY): Likewise.
* locales/eo (LC_MONETARY): Likewise.
* locales/es_CR (LC_MONETARY): Likewise.
* locales/es_CR (LC_NUMERIC): Likewise.
* locales/es_CU (LC_MONETARY): Likewise.
* locales/et_EE (LC_MONETARY): Likewise.
* locales/et_EE (LC_NUMERIC): Likewise.
* locales/fi_FI (LC_MONETARY): Likewise.
* locales/fi_FI (LC_NUMERIC): Likewise.
* locales/fr_CA (LC_MONETARY): Likewise.
* locales/fr_FR (LC_MONETARY): Likewise.
* locales/fr_FR (LC_NUMERIC): Likewise.
* locales/fr_LU (LC_MONETARY): Likewise.
* locales/fr_LU (LC_NUMERIC): Likewise.
* locales/hr_HR (LC_MONETARY): Likewise.
* locales/ht_HT (LC_NUMERIC): Likewise.
* locales/kk_KZ (LC_MONETARY): Likewise.
* locales/kk_KZ (LC_NUMERIC): Likewise.
* locales/ky_KG (LC_MONETARY): Likewise.
* locales/ky_KG (LC_NUMERIC): Likewise.
* locales/lv_LV (LC_MONETARY): Likewise.
* locales/lv_LV (LC_NUMERIC): Likewise.
* locales/mg_MG (LC_MONETARY): Likewise.
* locales/mhr_RU (LC_MONETARY): Likewise.
* locales/mk_MK (LC_MONETARY): Likewise.
* locales/mk_MK (LC_NUMERIC): Likewise.
* locales/mn_MN (LC_MONETARY): Likewise.
* locales/nb_NO (LC_MONETARY): Likewise.
* locales/nb_NO (LC_NUMERIC): Likewise.
* locales/nl_AW (LC_MONETARY): Likewise.
* locales/nl_NL (LC_MONETARY): Likewise.
* locales/nn_NO (LC_MONETARY): Likewise.
* locales/os_RU (LC_MONETARY): Likewise.
* locales/pap_AW (LC_MONETARY): Likewise.
* locales/pap_CW (LC_MONETARY): Likewise.
* locales/ru_RU (LC_MONETARY): Likewise.
* locales/ru_RU (LC_NUMERIC): Likewise.
* locales/ru_UA (LC_MONETARY): Likewise.
* locales/sk_SK (LC_MONETARY): Likewise.
* locales/sk_SK (LC_NUMERIC): Likewise.
* locales/sl_SI (LC_MONETARY): Likewise.
* locales/sl_SI (LC_NUMERIC): Likewise.
* locales/sq_MK (LC_MONETARY): Likewise.
* locales/sv_SE (LC_MONETARY): Likewise.
* locales/sv_SE (LC_NUMERIC): Likewise.
* locales/tg_TJ (LC_MONETARY): Likewise.
* locales/tt_RU (LC_MONETARY): Likewise.
* locales/tt_RU@iqtelif (LC_MONETARY): Likewise.
* locales/uk_UA (LC_MONETARY): Likewise.
* locales/uk_UA (LC_NUMERIC): Likewise.
* locales/unm_US (LC_MONETARY): Likewise.
* locales/unm_US (LC_NUMERIC): Likewise.
* locales/wo_SN (LC_MONETARY): Likewise.
[BZ #17563]
[BZ #16905]
* locales/cmn_TW (LC_COLLATE): Use cns11643_stroke file for sorting.
* locales/cmn_TW (LC_TIME): Improve time and date formats.
* locales/cmn_TW (LC_MESSAGES): Add yesstr and nostr.
* locales/cns11643_stroke: New file, stroke count collation for
traditional Chinese.
These comments are useless and only confusing. The encodings used to
create binary locales from source locales are listed in the
localedata/SUPPORTED file. The source files themselves are ASCII or UTF-8
encoded where non-ASCII UTF-8 is currently only used in comments. If
all locale source files are UTF-8 anyway, there is no need to specify
that in a special comment.
A new locale is added for the Seychelles, which is a member of the African
Union. English is an official language of the Seychelles.
[BZ #21854]
* locales/en_SC: New file.
* localedata/SUPPORTED: Add en_SC/UTF-8.
For the locales doi_IN, kok_IN, and sat_IN, the words for
“yes” and “no” were apparently in yesexpr and noexpr.
Copy them from there to add yesstr and nostr.
Also make yesexpr and noexpr more readable by using
the POSIX portable character set.
* locales/doi_IN (LC_MESSAGES): Add yesstr and nostr.
* locales/kok_IN (LC_MESSAGES): Add yesstr and nostr.
* locales/sat_IN (LC_MESSAGES): Add yesstr and nostr.
This reverts commit 8f75515080
Revert “Fix yesexpr in en_DK locale”.
* locales/en_DK (LC_MESSAGES): Restore original yesexpr, noexpr,
yesstr, nostr. Convert them to ASCII and add a comment why
we want to have them like this.
And make the expressions more readable by using the POSIX portable character set
instead of Unicode code points.
* locales/agr_PE (LC_MESSAGES): drop .* from yesexpr and noexpr
* locales/az_IR (LC_MESSAGES): Improve yesexpr and noexpr.
* locales/az_IR (LC_ADDRESS): Fix typo in comment and
use the individual iso-639-3 code for South Azerbaijani
"azb" in lang_term.
* locales/az_IR (LC_NAME): Improve readability of name_fmt in source.
After the recent import of month names from CLDRv31 (bug 21217,
commit c853f14) an import of abbreviated month names is also needed
to make sure they match the full forms.
In the case of kok_IN, CLDR does not provide the abbreviated month names
explicitly but uses the full month names instead, so the abmon section
has been copied from mon.
* localedata/locales/as_IN (abmon): Update from CLDR.
* localedata/locales/bn_BD (abmon): Likewise.
* localedata/locales/bn_IN (abmon): Likewise.
* localedata/locales/gu_IN (abmon): Likewise.
* localedata/locales/hi_IN (abmon): Likewise.
* localedata/locales/kn_IN (abmon): Likewise.
* localedata/locales/ml_IN (abmon): Likewise.
* localedata/locales/mr_IN (abmon): Likewise.
* localedata/locales/ne_NP (abmon): Likewise.
* localedata/locales/or_IN (abmon): Likewise.
* localedata/locales/pa_IN (abmon): Likewise.
* localedata/locales/ta_IN (abmon): Likewise.
* localedata/locales/te_IN (abmon): Likewise.
* localedata/locales/kok_IN (abmon): Likewise but copied from mon.
Maithili is an official language not only in India but in Nepal as well.
https://en.wikipedia.org/wiki/Maithili_language
Reference is taken from mai_IN.
[BZ #21835]
* localedata/locales/mai_NP: New file.
* localedata/SUPPORTED: Add mai_NP/UTF-8.
After the recent update of int_select the comment needed an update, too.
While at it, all comments in LC_TELEPHONE were moved above their
respective values because this looks better. Some minor typos fixed.
[BZ #21783]
* localedata/locales/lg_UG (LC_TELEPHONE): Move all comments
above the values, correct some of them.
The commit to add the Fiji Hindi locale mentioned
Bug 21207 - ce_RU: update weekdays from CLDR
which was wrong, correct is:
Bug 21694 - Current Glibc Locale Does Not Support Tok-Pisin and Fiji Hindi Locale
During Hindi Locale review I found many fields are incorrect
[BZ #21729]
* locales/hi_IN (LC_NAME): Fix name_mr, name_mrs, name_miss, name_ms
Signed-off-by: Akhilesh Kumar <akhilesh.k@samsung.com>
During locale verification I observed that
yesstr and nostr are missing for the Xhosa language locale
for South Africa.
[BZ #21724]
* locales/xh_ZA (LC_MESSAGES): add yesstr and nostr
Signed-off-by: Akhilesh Kumar <akhilesh.k@samsung.com>
During locale verification I observed incorrect full weekday names
for ks_IN@devanagari. Reference is taken from
http://www.mkraina.com/PDF/3-Self-authored%20Works%20(English)/15.pdf
and a Kashmiri Devanagari travel book and other sources.
[BZ #21721]
* locales/ks_IN@devanagari: Full weekday name Fix.
Signed-off-by: Akhilesh Kumar <akhilesh.k@samsung.com>
After the recent import of month names from CLDRv31 (bug 21217,
commit c853f14) more imports are also needed, mostly abbreviated month
names.
This patch also updates May (full month name) in ps_AF which was
skipped in the previous patch.
Incidentally, this import fixes bug 17225 (ar_SY) and partially
bug 19066 (ar_SA).
CLDR currently has a bug in the full month name for October for ar_IQ, see
http://unicode.org/cldr/trac/ticket/10460
* localedata/locales/ar_DZ (abmon): Full import from CLDR, abmon
is no longer abbreviated.
* localedata/locales/ar_IQ (abmon): Likewise.
* localedata/locales/ar_MA (abmon): Likewise.
* localedata/locales/ar_TN (abmon): Likewise.
* localedata/locales/ps_AF (abmon): Likewise.
* localedata/locales/ug_CN (abmon): Likewise.
* localedata/locales/ar_SA (abmon): Likewise, partially
fixes bug 19066.
* localedata/locales/ks_IN (abmon): A copy of mon.
* localedata/locales/ur_IN (abmon): Oct reworded "اكتوبر" to
"اکتوبر" (same change as mon).
* localedata/locales/ur_PK (abmon): Same changes as mon applied.
* localedata/locales/ps_AF (mon): May reworded "می" to "مۍ".
[BZ #17225]
* localedata/locales/ar_SY (abmon): May reworded "نوار" to
"أيار", this closes bug 17225.
* localedata/locales/ar_JO (abmon): Likewise.
* localedata/locales/ar_LB (abmon): Likewise.
[BZ #21711]
During locale verification I observed that
yesstr and nostr are missing for the Pashto [LC_MESSAGES] locale
for Afghanistan. References: Google Translate and a Pashto travel book.
[BZ #21706]
During locale verification I observed that
yesstr and nostr are missing for the Breton [LC_MESSAGES] locale.
Signed-off-by: Akhilesh Kumar <akhilesh.k@samsung.com>
After the recent import of month names from CLDR (bug 21217) more
imports are also needed, mostly abbreviated month names.
* localedata/locales/br_FR (abmon): Reworded "Eve " to "Mezh".
* localedata/locales/fy_NL (abmon): Reworded "Maa" (March) to
"Mrt" and "Maa" (May) to "Mai".
* localedata/locales/lg_UG (abmon): Reworded "Jun" to "Juu".
* localedata/locales/ln_CD (abmon): "yan", "fbl", "msi",
and so on.
* localedata/locales/mn_MN (abmon): "1-р сар", "2-р сар",
"3-р сар", and so on.
* localedata/locales/vi_VN (abmon): Reworded "Th01" to "Thg 1",
"Th02" to "Thg 2" and so on.
* localedata/locales/yo_NG (abday): "Àìkú", "Ajé", "Ìsẹ́gun",
and so on, also comment updated to match the new content.
(day): "Ọjọ́ Àìkú", "Ọjọ́ Ajé", "Ọjọ́ Ìsẹ́gun", and so on.
(abmon): "Ṣẹ́rẹ́", "Èrèlè", "Ẹrẹ̀nà", and so on.
(mon): Comment updated to match the actual content.
(d_t_fmt): Changed "%A" to "%a" and "%B" to "%b".
* localedata/locales/zu_ZA (abmon): "Jan", "Feb", "Mas",
and so on, also comment updated to match the new content.
(mon): comment updated to match the actual content.
This updates a bunch of locales based on CLDR v29 data:
az_AZ: changing Azərbaycanca to azərbaycan dili
be_BY: changing беларуская мова to беларуская
bem_ZM: changing iciBemba to Ichibemba
bg_BG: changing български език to български
bo_CN: changing པོད་སྐད་ to བོད་སྐད་
bo_IN: changing པོད་སྐད་ to བོད་སྐད་
br_FR: changing Brezhoneg to brezhoneg
brx_IN: lang_name: setting to बड़ो
ce_RU: changing нохчийн мотт to нохчийн
cs_CZ: changing Čeština to čeština
dz_BT: changing (རྫོང་ཁ to རྫོང་ཁ
el_CY: changing ελληνικά to Ελληνικά
el_GR: changing ελληνικά to Ελληνικά
es_AR: changing Español to español
es_BO: changing Español to español
es_CL: changing Español to español
es_CO: changing Español to español
es_CR: changing Español to español
es_CU: changing Español to español
es_DO: changing Español to español
es_EC: changing Español to español
es_ES: changing Español to español
es_GT: changing Español to español
es_HN: changing Español to español
es_MX: changing Español to español
es_NI: changing Español to español
es_PA: changing Español to español
es_PE: changing Español to español
es_PR: changing Español to español
es_PY: changing Español to español
es_SV: changing Español to español
es_US: changing Español to español
es_UY: changing Español to español
es_VE: changing Español to español
et_EE: changing eesti keel to eesti
eu_ES: changing Euskara to euskara
fr_BE: changing Français to français
fr_CA: changing Français to français
fr_CH: changing Français to français
fr_FR: changing Français to français
fr_LU: changing Français to français
fur_IT: changing Furlan to furlan
fy_NL: changing Frysk to West-Frysk
gl_ES: changing Galego to galego
gv_GB: changing y Ghaelg to Gaelg
he_IL: lang_name: setting to עברית
hsb_DE: changing Hornjoserbšćina to hornjoserbšćina
hy_AM: changing Հայերեն to հայերեն
id_ID: changing Bahasa Indonesia to Bahasa Indonesia
it_CH: changing Italiano to italiano
it_IT: changing Italiano to italiano
kl_GL: changing Kalaallisut to kalaallisut
km_KH: changing ភាសាខ្មែរ to ខ្មែរ
ko_KR: changing 한국말 to 한국어
ks_IN: changing kạ̄šur to کٲشُر
kw_GB: changing Kernowek to kernewek
ky_KG: changing Кыргызча to кыргызча
lg_UG: changing Oluganda to Luganda
lt_LT: changing lietuvių kalba to lietuvių
lv_LV: changing latviešu valoda to latviešu
mk_MK: changing македонск/и јазик to македонски
mn_MN: changing Монгол хэл to монгол
nb_NO: changing Bokmål to norsk bokmål
nn_NO: changing Nynorsk to nynorsk
os_RU: lang_name: setting to ирон
ru_RU: lang_name: setting to русский
ru_UA: lang_name: setting to русский
se_NO: changing Davvisámegiella to davvisámegiella
sk_SK: lang_name: setting to slovenčina
ta_IN: lang_name: setting to தமிழ்
ta_LK: lang_name: setting to தமிழ்
tk_TM: changing Türkmençe to türkmençe
tr_CY: changing Turkish to Türkçe
tr_TR: changing Turkish to Türkçe
ur_IN: lang_name: setting to {اردو}
ur_PK: lang_name: setting to {اردو}
vi_VN: changing Việt ngữ to Tiếng Việt
yo_NG: changing Yorùbá to Èdè Yorùbá
zu_ZA: changing IsiZulu to isiZulu
Most of these are simple case changes, but they match the CLDR db.
A search for a few of the others suggests they're also correct.
[BZ #21217]
* localedata/locales/ln_CD (mon): Months imported from CLDR, all changed:
"Yanwáli" to "sánzá ya yambo", "Febwáli" to "sánzá ya míbalé" and so on.
* locales/mn_MN (mon): reworded "Хулгана сарын" to "Нэгдүгээр сар",
"Үхэр сарын" to "Хоёрдугаар сар", and so on.
* locales/vi_VN (mon): reworded "Tháng một" to "Tháng 1", "Tháng hai"
to "Tháng 2", "Tháng ba" to "Tháng 3", and so on.
* locales/yo_NG (mon): reworded "Jánúárì" to "Oṣù Ṣẹ́rẹ́", "Fẹ́búárì"
to "Oṣù Èrèlè", "Máàṣì" to "Oṣù Ẹrẹ̀nà", and so on.
* locales/zu_ZA (mon): reworded "uMasingana" to "Januwari",
"uNhlolanja" to "Februwari", "uNdasa" to "Mashi", and so on.
[BZ #21217]
* localedata/locales/be_BY (mon, abmon): Reworded "Травень" ("travyen'")
and abbreviated "Тра" ("tra") to "Май" ("may").
* localedata/locales/be_BY@latin (mon, abmon): Likewise, "Travień"
and abbreviated "Tra" reworded to "Maj".
* localedata/locales/br_FR (day, abmon, mon, d_t_fmt): Use the proper
Unicode apostrophe (Cʼhwevrer, mercʼher, Dʼar).
* localedata/locales/es_PE (mon, abmon): Reworded "septiembre" to
"setiembre" and abbreviated "sep" to "set".
* localedata/locales/es_UY: Likewise.
* localedata/locales/fil_PH (mon): Reworded "Septiyembre" to "Setyembre"
and "Nobiyembre" to "Nobyembre".
* localedata/locales/fur_IT (mon, abmon): Reworded "Decembar" to
"Dicembar" and abbreviated "Dec" to "Dic".
* localedata/locales/fy_NL (mon): Reworded "Janaris" to "Jannewaris".
* localedata/locales/ha_NG (mon, abmon): Reworded "Fabrairu" to
"Faburairu" and "Afrilu" to "Afirilu", also abbreviated "Afr" to "Afi".
* localedata/locales/ig_NG (mon, abmon): All months begin with
uppercase. Reworded: "febụrụwarị" to "Febrụwarị", "epreel" to "Eprel",
"ọgọstụ" to "Ọgọọst", "nọvemba" to "Novemba", "nọv" to "Nov".
* localedata/locales/kw_GB (mon): Months imported from CLDR, many
changes: all "Mys" to "mis", "Mys Whevrel" to "mis Hwevrer", "Mys Merth"
to "mis Meurth", "Mys Evan" to "mis Metheven", and so on.
(abmon): "Whe>" to "Hwe", "Mer" to "Meu", "Evn" to "Met".
* localedata/locales/lg_UG (mon): Reworded "Julaai" to "Julaayi".
* localedata/locales/ln_CD (mon): Months imported from CLDR, all changed:
"Yanwáli" to "sánzá ya yambo", "Febwáli" to "sánzá ya míbalé" and so on.
* localedata/locales/mg_MG (mon, abmon): All months now begin with
uppercase.
* localedata/locales/se_NO (mon): Months imported from CLDR, all
suffixes changed from "mánu" to "mánnu".
* localedata/locales/sr_RS@latin (mon): Reworded "juni" to "jun" and
"juli" to "jul".
* localedata/locales/uz_UZ (mon): Reworded "Sentyabr" to "Sentabr"
and "Oktyabr" to "Oktabr".
* localedata/locales/uz_UZ@cyrillic (mon): Most of the names reworded,
no longer end with a soft sign.
* localedata/locales/wae_CH (mon): reworded "Bráchet" to "Bráčet",
"Öigschte" to "Öigšte", "Herbschtmánet" to "Herbštmánet",
"Chrischtmánet" to "Chrištmánet".
* Unicode 10.0.0 Support: Character encoding, character type info, and
transliteration tables are all updated to Unicode 10.0.0, using
generator scripts contributed by Mike FABIAN (Red Hat).
<locale.h> is specified to define locale_t in POSIX.1-2008, and so are
all of the headers that define functions that take locale_t arguments.
Under _GNU_SOURCE, the additional headers that define such functions
have also always defined locale_t. Therefore, there is no need to use
__locale_t in public function prototypes, nor in any internal code.
* ctype/ctype-c99_l.c, ctype/ctype.h, ctype/ctype_l.c
* include/monetary.h, include/stdlib.h, include/time.h
* include/wchar.h, locale/duplocale.c, locale/freelocale.c
* locale/global-locale.c, locale/langinfo.h, locale/locale.h
* locale/localeinfo.h, locale/newlocale.c
* locale/nl_langinfo_l.c, locale/uselocale.c
* localedata/bug-usesetlocale.c, localedata/tst-xlocale2.c
* stdio-common/vfscanf.c, stdlib/monetary.h, stdlib/stdlib.h
* stdlib/strfmon_l.c, stdlib/strtod_l.c, stdlib/strtof_l.c
* stdlib/strtol.c, stdlib/strtol_l.c, stdlib/strtold_l.c
* stdlib/strtoll_l.c, stdlib/strtoul_l.c, stdlib/strtoull_l.c
* string/strcasecmp.c, string/strcoll_l.c, string/string.h
* string/strings.h, string/strncase.c, string/strxfrm_l.c
* sysdeps/ieee754/float128/strtof128_l.c
* sysdeps/ieee754/float128/wcstof128.c
* sysdeps/ieee754/float128/wcstof128_l.c
* sysdeps/ieee754/ldbl-128ibm/strtold_l.c
* sysdeps/ieee754/ldbl-64-128/strtold_l.c
* sysdeps/ieee754/ldbl-opt/nldbl-compat.c
* sysdeps/ieee754/ldbl-opt/nldbl-strfmon_l.c
* sysdeps/ieee754/ldbl-opt/nldbl-strtold_l.c
* sysdeps/ieee754/ldbl-opt/nldbl-wcstold_l.c
* sysdeps/powerpc/powerpc32/power7/strcasecmp.S
* sysdeps/powerpc/powerpc64/power7/strcasecmp.S
* sysdeps/x86_64/strcasecmp_l-nonascii.c
* sysdeps/x86_64/strncase_l-nonascii.c, time/strftime_l.c
* time/strptime_l.c, time/time.h, wcsmbs/mbsrtowcs_l.c
* wcsmbs/wchar.h, wcsmbs/wcscasecmp.c, wcsmbs/wcsncase.c
* wcsmbs/wcstod.c, wcsmbs/wcstod_l.c, wcsmbs/wcstof.c
* wcsmbs/wcstof_l.c, wcsmbs/wcstol_l.c, wcsmbs/wcstold.c
* wcsmbs/wcstold_l.c, wcsmbs/wcstoll_l.c, wcsmbs/wcstoul_l.c
* wcsmbs/wcstoull_l.c, wctype/iswctype_l.c
* wctype/towctrans_l.c, wctype/wcfuncs_l.c
* wctype/wctrans_l.c, wctype/wctype.h, wctype/wctype_l.c:
Change all uses of __locale_t to locale_t.
[BZ #21207]
* locales/ce_RU (day): Updated (imported) from CLDR. Uppercase letters
left unchanged.
* locales/ce_RU (abday): Minor updates to match (day): Latin uppercase
"I" replaced with Cyrillic "Ӏ" ("Palochka", Unicode: U+04C0). Trailing
spaces removed.
Although el_GR is ISO-8859-7:2003, which contains the euro symbol, it
is not possible to know this a priori when selecting the el_GR locale.
You therefore cannot tell whether el_GR includes the 2003 amendments
that add the euro symbol. This is resolved by creating an el_GR@euro
locale, similar to all the other @euro locales for non-UTF-8
charsets.
Fix the incorrect sorting order of a digraph and its geminated variant,
regression introduced by a faulty fix to bug 13547 in commit
b008d4c856.
Fix two inconsistencies in sorting unusual capitalization of digraphs
(bug #18587).
Enable DIACRIT_FORWARD to work around bug #17750.
Sort foreign accents after the Hungarian ones.
Add extensive unit tests containing all the examples from The Rules of
Hungarian Orthography and many more, including explanatory comments.
* Unicode 9.0.0 Support: Character encoding, character type info, and
transliteration tables are all updated to Unicode 9.0.0, using
generator scripts contributed by Mike FABIAN (Red Hat).
Both regexes end with a "*." which means the previous match can be
omitted, and then the . allows them to match any input at all.
This means tools like coreutils' `rm -i` will always delete things
when prompted because the yesexpr regex matches all inputs (even
the negative ones).
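The failure mode can be reproduced with a small sketch. The pattern below is hypothetical (the affected locale's actual yesexpr is not quoted here), and Python's `re` module stands in for the POSIX regex matching a real caller would use; for a pattern this simple the two should agree.

```python
import re

# Hypothetical yesexpr with the trailing "*." described above; this is
# NOT the actual regex from the affected locale.
BROKEN_YESEXPR = r"^[yY]*."
# The same expression with the trailing "*." removed.
FIXED_YESEXPR = r"^[yY]"


def answers_yes(expr: str, reply: str) -> bool:
    """Return True if the reply matches the given yes-expression."""
    return re.search(expr, reply) is not None


if __name__ == "__main__":
    for reply in ("yes", "no", "maybe"):
        print(reply,
              answers_yes(BROKEN_YESEXPR, reply),
              answers_yes(FIXED_YESEXPR, reply))
```

With the broken pattern, `[yY]*` is allowed to match nothing and the `.` then consumes the first character of any non-empty reply, so a negative answer is accepted as affirmative.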
Microsoft long ago added a mapping for 0x80 to the Euro sign to their
CP936. While GBK 1.0 doesn't include this mapping, it is compatible,
and Microsoft and glibc alias the two codepages. We could split them
apart so GBK wouldn't include the mapping, but that seems like a lot
of work for little gain.
The standard currently in effect (LST ISO 8601:1997) mandates the use
of hyphens (as opposed to full stops, currently) in date formats. It
also matches current CLDR data (v29), Wikipedia's & Wikia's settings,
and Microsoft's Lithuanian Style Guide.
According to "Requirements of information technology in Estonian
language and cultural environment" the monetary symbol should be
written after the amount number:
https://www.evs.ee/products/evs-8-2008
en_CA/LC_TIME currently specifies the date format as "%d/%m/%y".
However, it should be "%Y-%m-%d". This is the standard date format in
Canada as specified by the Canadian Standards Association in CSA Z234.5:1989,
which adopts the ISO 8601 standard.
Here's the web page from the National Research Council of Canada
citing ISO 8601 as the standard date/time format in Canada:
http://www.nrc-cnrc.gc.ca/eng/services/time/faq/#Q8
International Standard ISO 8601 specifies numeric representations
of date and time. The recommended full format is of the form
2001-12-31 23:59:28.73 UTC. The intent of this standard is to avoid
confusion in international communications which can arise with the
many different national notations. This format has the advantage
that it permits dates to be readily sorted in chronological order
by computer systems.
Windows 8+ and OS X also switched to this format.
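As a rough illustration of the two formats and of the sorting property mentioned in the quoted text, the sketch below formats a few sample dates with Python's `strftime`, which accepts the same conversion specifiers. The sample dates and helper names are made up for the example.

```python
from datetime import date

OLD_FMT = "%d/%m/%y"   # format previously used by en_CA
NEW_FMT = "%Y-%m-%d"   # ISO 8601 style format

# Arbitrary sample dates, deliberately not in chronological order.
SAMPLES = [date(2001, 12, 31), date(2002, 1, 15), date(1999, 6, 1)]


def formatted(fmt: str) -> list[str]:
    """Format every sample date with the given strftime format."""
    return [d.strftime(fmt) for d in SAMPLES]


def sorts_chronologically(fmt: str) -> bool:
    """True if sorting the strings gives the same order as sorting the dates."""
    expected = [d.strftime(fmt) for d in sorted(SAMPLES)]
    return sorted(formatted(fmt)) == expected


if __name__ == "__main__":
    print(formatted(OLD_FMT), sorts_chronologically(OLD_FMT))
    print(formatted(NEW_FMT), sorts_chronologically(NEW_FMT))
```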
Fix the postal_fmt and country_name entries to continue on the following
line without indentation.
localedata/Changelog:
* locales/de_LI (postal_fmt): Fix indentation.
(country_name): Likewise.
The Principality of Liechtenstein currently does not have a corresponding
locale. Given the links with Switzerland, the best approach is to base the
locale on the de_CH one (German is the official language) and only change
the country-related categories: LC_ADDRESS and LC_TELEPHONE.
localedata/Changelog:
* locales/de_LI: New locale.
* SUPPORTED: Add de_LI.
Some of the newer symbols we're using are missing translit entries, which
causes trouble when generating the locales with older encodings.
tr_TR: ₺ -> "TL"
uz_UZ: ʻ -> "'"
common:
֏ -> "AMD"
₪ -> "ILS"
₱ -> "PHP"
₸ -> "KZT"
₾ -> "GEL"
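The effect of these fallbacks can be sketched outside of localedef. The snippet below is only an illustration: it merges the locale-specific and common entries listed above into one table, and the helper name is hypothetical. It is not how glibc applies translit rules internally.

```python
# Fallback strings taken from the list above. For simplicity the tr_TR and
# uz_UZ entries are merged with the common ones (an assumption made only
# for this example).
TRANSLIT = {
    "₺": "TL",
    "ʻ": "'",
    "֏": "AMD",
    "₪": "ILS",
    "₱": "PHP",
    "₸": "KZT",
    "₾": "GEL",
}

_TABLE = str.maketrans(TRANSLIT)


def to_fallback(text: str) -> str:
    """Replace each listed symbol with its plain-ASCII fallback string."""
    return text.translate(_TABLE)


if __name__ == "__main__":
    price = "100 ₺"
    print(to_fallback(price))
    # After transliteration the text can be encoded in an older charset.
    print(to_fallback(price).encode("ascii"))
```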
The current test code doesn't check the return value of malloc.
This should rarely (if ever) cause a problem, but rather than add
some return value checks, just statically allocate the buffer on
the stack. This will never fail (or if it does, we've got much
bigger problems that don't matter to the test).
The yes/no strings should be based on the dictionary words. That means
they are capitalized based on the dictionary rather than position in the
sentence (e.g. the first word).
bo_CN: nostr: changing མེན to མིན།
bo_CN: yesstr: changing ཨིན to ཡིན།
dz_BT: nostr: changing མེན to མེན་
dz_BT: yesstr: changing ཨིན to ཨིན་
en_CA: yesstr: changing Yes to yes
en_CA: nostr: changing No to no
en_US: yesstr: changing Yes to yes
en_US: nostr: changing No to no
es_ES: nostr: changing No to no
es_ES: yesstr: changing Si to sí
fi_FI: nostr: changing Ei to ei
fi_FI: yesstr: changing Kyllä to kyllä
ig_NG: yesstr: changing Ee to Eye
ko_KR: nostr: changing 아니오 to 아니요
ky_KG: nostr: changing Жок to жок
ky_KG: yesstr: changing Ооба to ооба
ms_MY: nostr: changing Tidak to tidak
ms_MY: yesstr: changing Ya to ya
te_IN: nostr: changing కాదు to వద్దు
te_IN: yesstr: changing అవను to అవును
ur_PK: nostr: changing نهيں to نہیں
ur_PK: yesstr: changing بلكل to ہاں
uz_UZ: nostr: changing Yo'q to yo‘q
uz_UZ: yesstr: changing Ha to ha
uz_UZ@cyrillic: nostr: changing Йўқ to йўқ
uz_UZ@cyrillic: yesstr: changing Ҳа to ҳа
wae_CH: nostr: changing Nei to nei
wae_CH: yesstr: changing Ja to ja
yo_NG: nostr: changing Bẹ́ẹ̀ kọ́ to Bẹ́ẹ̀kọ́
yo_NG: yesstr: changing Bẹ́ẹ̀ ni to Bẹ́ẹ̀ni
Some of the translations were just wrong.
el_GR: nostr: changing no to όχι
el_GR: yesstr: changing yes to ναι
km_KH: nostr: changing no:NO:n:N to ទេ៖ n
km_KH: yesstr: changing yes:YES:y:Y to បាទ/ចាស៖ y
ug_CN: nostr: changing No to ياق
ug_CN: yesstr: changing Yes to ھەئە
Add missing translations for a number of locales:
af_ZA: nostr: setting to nee
af_ZA: yesstr: setting to ja
am_ET: nostr: setting to አይ
am_ET: yesstr: setting to አዎን
ast_ES: nostr: setting to non
ast_ES: yesstr: setting to sí
be_BY: nostr: setting to не
be_BY: yesstr: setting to так
bem_ZM: nostr: setting to Awe
bem_ZM: yesstr: setting to Ee
bg_BG: nostr: setting to не
bg_BG: yesstr: setting to да
brx_IN: nostr: setting to नहीं
brx_IN: yesstr: setting to हाँ
bs_BA: nostr: setting to ne
bs_BA: yesstr: setting to da
ca_ES: nostr: setting to no
ca_ES: yesstr: setting to sí
da_DK: nostr: setting to nej
da_DK: yesstr: setting to ja
de_DE: nostr: setting to nein
de_DE: yesstr: setting to ja
en_DK: nostr: setting to no
en_DK: yesstr: setting to yes
et_EE: nostr: setting to ei
et_EE: yesstr: setting to jah
eu_ES: nostr: setting to ez
eu_ES: yesstr: setting to bai
fa_IR: nostr: setting to نه
fa_IR: yesstr: setting to بله
ff_SN: nostr: setting to Alaa
ff_SN: yesstr: setting to Eey
fo_FO: nostr: setting to nei
fo_FO: yesstr: setting to já
fr_BE: nostr: setting to non
fr_BE: yesstr: setting to oui
fr_CH: nostr: setting to non
fr_CH: yesstr: setting to oui
fr_FR: nostr: setting to non
fr_FR: yesstr: setting to oui
fr_LU: nostr: setting to non
fr_LU: yesstr: setting to oui
fur_IT: nostr: setting to no
fur_IT: yesstr: setting to sì
fy_DE: nostr: setting to nee
fy_DE: yesstr: setting to ja
ga_IE: nostr: setting to níl
ga_IE: yesstr: setting to tá
gd_GB: nostr: setting to chan eil
gd_GB: yesstr: setting to tha
gl_ES: nostr: setting to non
gl_ES: yesstr: setting to si
gu_IN: nostr: setting to નહીં
gu_IN: yesstr: setting to હા
he_IL: nostr: setting to לא
he_IL: yesstr: setting to כן
hi_IN: nostr: setting to नहीं
hi_IN: yesstr: setting to हाँ
hr_HR: nostr: setting to ne
hr_HR: yesstr: setting to da
hu_HU: nostr: setting to nem
hu_HU: yesstr: setting to igen
id_ID: nostr: setting to tidak
id_ID: yesstr: setting to ya
is_IS: nostr: setting to nei
is_IS: yesstr: setting to já
it_CH: nostr: setting to no
it_CH: yesstr: setting to sì
it_IT: nostr: setting to no
it_IT: yesstr: setting to sì
ka_GE: nostr: setting to არა
ka_GE: yesstr: setting to კი
kk_KZ: nostr: setting to жоқ
kk_KZ: yesstr: setting to иә
kl_GL: nostr: setting to naagga
kl_GL: yesstr: setting to aap
kn_IN: nostr: setting to ಇಲ್ಲ
kn_IN: yesstr: setting to ಹೌದು
ko_KR: yesstr: setting to 예
lb_LU: nostr: setting to nee
lb_LU: yesstr: setting to jo
lg_UG: nostr: setting to Nedda
lg_UG: yesstr: setting to Ye
lt_LT: nostr: setting to ne
lt_LT: yesstr: setting to taip
lv_LV: nostr: setting to nē
lv_LV: yesstr: setting to jā
mg_MG: nostr: setting to Tsia
mg_MG: yesstr: setting to Eny
mn_MN: nostr: setting to үгүй
mn_MN: yesstr: setting to тийм
mr_IN: nostr: setting to नाहीःना
mr_IN: yesstr: setting to होयःहो
mt_MT: nostr: setting to le
mt_MT: yesstr: setting to iva
nb_NO: nostr: setting to nei
nb_NO: yesstr: setting to ja
ne_NP: nostr: setting to होइन
ne_NP: yesstr: setting to हो
nl_NL: nostr: setting to nee
nl_NL: yesstr: setting to ja
nn_NO: nostr: setting to nei
nn_NO: yesstr: setting to ja
or_IN: nostr: setting to ନା
or_IN: yesstr: setting to ହଁ
os_RU: nostr: setting to нӕйы
os_RU: yesstr: setting to уойы
pa_IN: nostr: setting to ਨਹੀਂ
pa_IN: yesstr: setting to ਹਾਂ
pl_PL: nostr: setting to nie
pl_PL: yesstr: setting to tak
pt_BR: nostr: setting to não
pt_BR: yesstr: setting to sim
pt_PT: nostr: setting to não
pt_PT: yesstr: setting to sim
ro_RO: nostr: setting to nu
ro_RO: yesstr: setting to da
ru_RU: nostr: setting to нет
ru_RU: yesstr: setting to да
ru_UA: nostr: setting to нет
ru_UA: yesstr: setting to да
se_NO: nostr: setting to ii
se_NO: yesstr: setting to jo
sl_SI: nostr: setting to ne
sl_SI: yesstr: setting to da
so_DJ: nostr: setting to maya
so_DJ: yesstr: setting to haa
so_SO: nostr: setting to maya
so_SO: yesstr: setting to haa
sq_AL: nostr: setting to jo
sq_AL: yesstr: setting to po
sr_RS@latin: nostr: setting to ne
sr_RS@latin: yesstr: setting to da
sr_RS: nostr: setting to не
sr_RS: yesstr: setting to да
sv_SE: nostr: setting to nej
sv_SE: yesstr: setting to ja
sw_KE: nostr: setting to Hapana
sw_KE: yesstr: setting to Ndiyo
yue_HK: nostr: setting to 唔係
yue_HK: yesstr: setting to 係
zu_ZA: nostr: setting to cha
zu_ZA: yesstr: setting to yebo
The vast majority of languages include yY/nN in their yes/no regexes.
Standardize the few that were missing them.
ms_MY: noexpr: add nN
nan_TW@latin: yesexpr: add yY
nan_TW@latin: noexpr: add nN
se_NO: noexpr: add nN
This also highlighted a few that were incorrectly using yY/nN because
they clashed with their localized messages:
uz_UZ: yesexpr: change ^[+1YyHh] to ^[+1ҲҳHh]
uz_UZ: noexpr: change ^[-0JjNn] to ^[-0ЙйNnYyJj]
uz_UZ@cyrillic: yesexpr: change ^[+1ҲҳYy] to ^[+1ҲҳHh]
uz_UZ@cyrillic: noexpr: change ^[-0ЙйNn] to ^[-0ЙйNnYyJj]
yo_NG: move nN (short for Bẹ́ẹ̀ni) from noexpr to yesexpr
A handful of regexes were allowing +1 for yesexpr and -0 for noexpr,
which is also what the i18n definition does. Standardize all locales by
allowing these language-independent values in them.
Example change for en_US goes from ^[yY] to ^[+1yY], and from ^[nN]
to ^[-0nN].
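A quick way to see what the en_US change accepts is to try the new expressions against a few replies. This sketch uses Python's `re` module rather than the C library's regex functions, and the `classify` helper is hypothetical.

```python
import re

# The new en_US expressions quoted above.
YESEXPR = r"^[+1yY]"
NOEXPR = r"^[-0nN]"


def classify(reply: str) -> str:
    """Classify a reply as 'yes', 'no', or 'unknown' using the expressions."""
    if re.search(YESEXPR, reply):
        return "yes"
    if re.search(NOEXPR, reply):
        return "no"
    return "unknown"


if __name__ == "__main__":
    for reply in ("y", "Yes", "+", "1", "n", "No", "-", "0", "maybe"):
        print(repr(reply), classify(reply))
```

Note that the `-` is placed first inside the bracket expression so that it is taken literally rather than as a range operator.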
Tweak some of the collation settings for a few characters.
Add/update various fields:
LC_MESSAGES
yesstr: set to иә
nostr: set to жоқ
LC_MONETARY
mon_decimal_point: change . to ,
mon_thousands_sep: change to a non-breaking space
p_sep_by_space: change 1 to 2
set int_{p,n}_* fields
LC_NUMERIC
thousands_sep: change , to a non-breaking space
LC_TIME
abday: change saturday from Сн to Сб
LC_TELEPHONE
tel_dom_fmt: set to (%A) %l
int_select: set to 8~10
LC_ADDRESS:
country_post: set to KAZ
country_ab2: set to KZ
country_ab3: set to KAZ
country_isbn: set to 978-601
lang_name: set to қазақ тілі
I've spot-checked a number of these, including some that were definitely
wrong (like ff_SN). It also fixes all open week-related bugs.
Since ff_SN is the only one that changes its base date, I also made
sure that its ordering of day translations was correct. Looks like
another case Petr brought up where the week field was not actually
checked against the day arrays.
I also took the opportunity to drop first_weekday/first_workday when
the value aligned with the defaults (1 & 2 respectively). This didn't
impact too many locales in practice because the majority omitted them
already.
A few locales were defining some values incorrectly for their region:
ak_GH: week: changing [7, 19971130, 7] to [7, 19971130, 1]
ak_GH: first_weekday: changing 1 to 2
ayc_PE: week: changing [7, 19971130, 7] to [7, 19971130, 1]
bem_ZM: week: changing [7, 19971130, 4] to [7, 19971130, 1]
bem_ZM: first_weekday: changing 1 to 2
en_IE: first_weekday: changing 2 to 1
en_US: week: changing [7, 19971130, 7] to [7, 19971130, 1]
es_CO: first_weekday: changing 2 to 1
es_ES: week: changing [7, 19971130, 5] to [7, 19971130, 4]
ff_SN: week: changing [7, 19971129, 1] to [7, 19971130, 1]
ff_SN: first_weekday: changing 1 to 2
ga_IE: first_weekday: changing 2 to 1
ht_HT: week: changing [7, 19971130, 7] to [7, 19971130, 1]
ht_HT: first_weekday: changing 1 to 2
mk_MK: week: changing [7, 19971130, 4] to [7, 19971130, 1]
mt_MT: first_weekday: changing 2 to 1
quz_PE: week: changing [7, 19971130, 7] to [7, 19971130, 1]
sr_ME: week: changing [7, 19971130, 4] to [7, 19971130, 1]
sr_RS: week: changing [7, 19971130, 4] to [7, 19971130, 1]
sr_RS@latin: week: changing [7, 19971130, 4] to [7, 19971130, 1]
sw_KE: week: changing [7, 19971130, 4] to [7, 19971130, 1]
sw_KE: first_weekday: changing 2 to 1
uk_UA: week: changing [7, 19971130, 4] to [7, 19971130, 1]
unm_US: week: changing [7, 19971130, 4] to [7, 19971130, 1]
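To make the base-date values above easier to read, the sketch below computes which weekday each date falls on. The second helper assumes that first_weekday is a 1-based offset counted from the base date's weekday; that interpretation is an assumption of this example, and both helper names are hypothetical.

```python
from datetime import datetime

DAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday")


def base_weekday(yyyymmdd: int) -> str:
    """Return the weekday name of a week-field base date such as 19971130."""
    day = datetime.strptime(str(yyyymmdd), "%Y%m%d").date()
    return DAY_NAMES[day.weekday()]


def first_weekday_name(yyyymmdd: int, first_weekday: int) -> str:
    """Weekday selected by first_weekday, ASSUMING it is a 1-based offset
    from the base date's weekday."""
    day = datetime.strptime(str(yyyymmdd), "%Y%m%d").date()
    return DAY_NAMES[(day.weekday() + first_weekday - 1) % 7]


if __name__ == "__main__":
    print(base_weekday(19971130))           # common base date
    print(base_weekday(19971129))           # old ff_SN base date
    print(first_weekday_name(19971130, 2))  # e.g. ff_SN after the change
```

Under that assumption, a base date of 19971130 with first_weekday 2 selects Monday, while the old ff_SN base date of 19971129 shifts every index by one day.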
Some locales were copying locales that had the wrong week settings, so
that content had to be duplicated so the values could be adjusted:
el_CY: week: setting to [7, 19971130, 1]
en_AG: week: setting to [7, 19971130, 1]
en_AG: first_weekday: changing 2 to 1
en_ZM: week: setting to [7, 19971130, 1]
es_CU: week: setting to [7, 19971130, 1]
nl_AW: week: setting to [7, 19971130, 1]
sw_TZ: first_weekday: setting to 2
ta_LK: first_weekday: setting to 2
The majority of locales were omitting the week field thus getting the
default [7, 19971130, 0 (localedef) / 7 (ISO standard)]. Unfortunately,
neither of those is used by any locale, so we end up having to define
the field just to set the ndays field. In practice, this rarely matters
due to its usage, and the first two fields match the defaults.
aa_DJ: setting to [7, 19971130, 1]
aa_ER: setting to [7, 19971130, 1]
aa_ER@saaho: setting to [7, 19971130, 1]
aa_ET: setting to [7, 19971130, 1]
af_ZA: setting to [7, 19971130, 1]
am_ET: setting to [7, 19971130, 1]
an_ES: setting to [7, 19971130, 4]
anp_IN: setting to [7, 19971130, 1]
ar_AE: setting to [7, 19971130, 1]
ar_BH: setting to [7, 19971130, 1]
ar_DZ: setting to [7, 19971130, 1]
ar_EG: setting to [7, 19971130, 1]
ar_IN: setting to [7, 19971130, 1]
ar_IQ: setting to [7, 19971130, 1]
ar_JO: setting to [7, 19971130, 1]
ar_KW: setting to [7, 19971130, 1]
ar_LB: setting to [7, 19971130, 1]
ar_LY: setting to [7, 19971130, 1]
ar_MA: setting to [7, 19971130, 1]
ar_OM: setting to [7, 19971130, 1]
ar_QA: setting to [7, 19971130, 1]
ar_SA: setting to [7, 19971130, 1]
ar_SD: setting to [7, 19971130, 1]
ar_SS: setting to [7, 19971130, 1]
ar_SY: setting to [7, 19971130, 1]
ar_TN: setting to [7, 19971130, 1]
ar_YE: setting to [7, 19971130, 1]
as_IN: setting to [7, 19971130, 1]
ast_ES: setting to [7, 19971130, 4]
az_AZ: setting to [7, 19971130, 1]
be_BY: setting to [7, 19971130, 1]
be_BY@latin: setting to [7, 19971130, 1]
ber_DZ: setting to [7, 19971130, 1]
ber_MA: setting to [7, 19971130, 1]
bg_BG: setting to [7, 19971130, 4]
bhb_IN: setting to [7, 19971130, 1]
bho_IN: setting to [7, 19971130, 1]
bn_BD: setting to [7, 19971130, 1]
bn_IN: setting to [7, 19971130, 1]
bo_CN: setting to [7, 19971130, 1]
br_FR: setting to [7, 19971130, 4]
brx_IN: setting to [7, 19971130, 1]
bs_BA: setting to [7, 19971130, 1]
byn_ER: setting to [7, 19971130, 1]
ca_AD: setting to [7, 19971130, 4]
ca_ES: setting to [7, 19971130, 4]
ca_ES@euro: setting to [7, 19971130, 4]
ca_FR: setting to [7, 19971130, 4]
ca_IT: setting to [7, 19971130, 4]
ce_RU: setting to [7, 19971130, 1]
cmn_TW: setting to [7, 19971130, 1]
crh_UA: setting to [7, 19971130, 1]
cv_RU: setting to [7, 19971130, 1]
cy_GB: setting to [7, 19971130, 4]
de_BE: setting to [7, 19971130, 4]
de_LU: setting to [7, 19971130, 4]
doi_IN: setting to [7, 19971130, 1]
dv_MV: setting to [7, 19971130, 1]
dz_BT: setting to [7, 19971130, 1]
el_GR: setting to [7, 19971130, 4]
el_GR@euro: setting to [7, 19971130, 4]
en_AU: setting to [7, 19971130, 1]
en_BW: setting to [7, 19971130, 1]
en_CA: setting to [7, 19971130, 1]
en_HK: setting to [7, 19971130, 1]
en_IE: setting to [7, 19971130, 4]
en_IN: setting to [7, 19971130, 1]
en_NG: setting to [7, 19971130, 1]
en_NZ: setting to [7, 19971130, 1]
en_PH: setting to [7, 19971130, 1]
en_SG: setting to [7, 19971130, 1]
en_ZA: setting to [7, 19971130, 1]
en_ZW: setting to [7, 19971130, 1]
es_AR: setting to [7, 19971130, 1]
es_BO: setting to [7, 19971130, 1]
es_CL: setting to [7, 19971130, 1]
es_CO: setting to [7, 19971130, 1]
es_CR: setting to [7, 19971130, 1]
es_DO: setting to [7, 19971130, 1]
es_EC: setting to [7, 19971130, 1]
es_ES@euro: setting to [7, 19971130, 4]
es_GT: setting to [7, 19971130, 1]
es_HN: setting to [7, 19971130, 1]
es_MX: setting to [7, 19971130, 1]
es_NI: setting to [7, 19971130, 1]
es_PA: setting to [7, 19971130, 1]
es_PE: setting to [7, 19971130, 1]
es_PR: setting to [7, 19971130, 1]
es_PY: setting to [7, 19971130, 1]
es_SV: setting to [7, 19971130, 1]
es_US: setting to [7, 19971130, 1]
es_UY: setting to [7, 19971130, 1]
es_VE: setting to [7, 19971130, 1]
eu_ES: setting to [7, 19971130, 4]
fa_IR: setting to [7, 19971130, 1]
fil_PH: setting to [7, 19971130, 1]
fo_FO: setting to [7, 19971130, 4]
fr_CA: setting to [7, 19971130, 1]
fr_CH: setting to [7, 19971130, 4]
fr_LU: setting to [7, 19971130, 4]
fy_NL: setting to [7, 19971130, 4]
ga_IE: setting to [7, 19971130, 4]
gd_GB: setting to [7, 19971130, 4]
gez_ER: setting to [7, 19971130, 1]
gez_ET: setting to [7, 19971130, 1]
gl_ES: setting to [7, 19971130, 4]
gu_IN: setting to [7, 19971130, 1]
gv_GB: setting to [7, 19971130, 4]
hak_TW: setting to [7, 19971130, 1]
ha_NG: setting to [7, 19971130, 1]
he_IL: setting to [7, 19971130, 1]
hi_IN: setting to [7, 19971130, 1]
hne_IN: setting to [7, 19971130, 1]
hr_HR: setting to [7, 19971130, 1]
hy_AM: setting to [7, 19971130, 1]
id_ID: setting to [7, 19971130, 1]
ig_NG: setting to [7, 19971130, 1]
ik_CA: setting to [7, 19971130, 1]
is_IS: setting to [7, 19971130, 4]
it_CH: setting to [7, 19971130, 4]
it_IT: setting to [7, 19971130, 4]
it_IT@euro: setting to [7, 19971130, 4]
iu_CA: setting to [7, 19971130, 1]
ja_JP: setting to [7, 19971130, 1]
ka_GE: setting to [7, 19971130, 1]
kk_KZ: setting to [7, 19971130, 1]
kl_GL: setting to [7, 19971130, 1]
km_KH: setting to [7, 19971130, 1]
kn_IN: setting to [7, 19971130, 1]
kok_IN: setting to [7, 19971130, 1]
ko_KR: setting to [7, 19971130, 1]
ks_IN: setting to [7, 19971130, 1]
ks_IN@devanagari: setting to [7, 19971130, 1]
ku_TR: setting to [7, 19971130, 1]
kw_GB: setting to [7, 19971130, 4]
ky_KG: setting to [7, 19971130, 1]
lg_UG: setting to [7, 19971130, 1]
lij_IT: setting to [7, 19971130, 4]
lo_LA: setting to [7, 19971130, 1]
lt_LT: setting to [7, 19971130, 4]
lv_LV: setting to [7, 19971130, 1]
lzh_TW: setting to [7, 19971130, 1]
mag_IN: setting to [7, 19971130, 1]
mai_IN: setting to [7, 19971130, 1]
mg_MG: setting to [7, 19971130, 1]
mhr_RU: setting to [7, 19971130, 1]
mi_NZ: setting to [7, 19971130, 1]
ml_IN: setting to [7, 19971130, 1]
mni_IN: setting to [7, 19971130, 1]
mn_MN: setting to [7, 19971130, 1]
mr_IN: setting to [7, 19971130, 1]
ms_MY: setting to [7, 19971130, 1]
mt_MT: setting to [7, 19971130, 1]
my_MM: setting to [7, 19971130, 1]
nan_TW: setting to [7, 19971130, 1]
nan_TW@latin: setting to [7, 19971130, 1]
ne_NP: setting to [7, 19971130, 1]
nhn_MX: setting to [7, 19971130, 1]
niu_NU: setting to [7, 19971130, 1]
niu_NZ: setting to [7, 19971130, 1]
nl_BE: setting to [7, 19971130, 4]
nl_BE@euro: setting to [7, 19971130, 4]
nr_ZA: setting to [7, 19971130, 1]
nso_ZA: setting to [7, 19971130, 1]
oc_FR: setting to [7, 19971130, 4]
om_ET: setting to [7, 19971130, 1]
om_KE: setting to [7, 19971130, 1]
or_IN: setting to [7, 19971130, 1]
os_RU: setting to [7, 19971130, 1]
pa_IN: setting to [7, 19971130, 1]
pap_AW: setting to [7, 19971130, 1]
pap_CW: setting to [7, 19971130, 1]
pa_PK: setting to [7, 19971130, 1]
ps_AF: setting to [7, 19971130, 1]
pt_BR: setting to [7, 19971130, 1]
pt_PT: setting to [7, 19971130, 4]
pt_PT@euro: setting to [7, 19971130, 4]
raj_IN: setting to [7, 19971130, 1]
ro_RO: setting to [7, 19971130, 1]
ru_RU: setting to [7, 19971130, 1]
ru_UA: setting to [7, 19971130, 1]
rw_RW: setting to [7, 19971130, 1]
sa_IN: setting to [7, 19971130, 1]
sat_IN: setting to [7, 19971130, 1]
sd_IN: setting to [7, 19971130, 1]
sd_IN@devanagari: setting to [7, 19971130, 1]
se_NO: setting to [7, 19971130, 4]
shs_CA: setting to [7, 19971130, 1]
sid_ET: setting to [7, 19971130, 1]
si_LK: setting to [7, 19971130, 1]
sl_SI: setting to [7, 19971130, 1]
so_DJ: setting to [7, 19971130, 1]
so_ET: setting to [7, 19971130, 1]
so_KE: setting to [7, 19971130, 1]
so_SO: setting to [7, 19971130, 1]
sq_AL: setting to [7, 19971130, 1]
ss_ZA: setting to [7, 19971130, 1]
st_ZA: setting to [7, 19971130, 1]
sv_FI: setting to [7, 19971130, 4]
sv_SE: setting to [7, 19971130, 4]
ta_IN: setting to [7, 19971130, 1]
tcy_IN: setting to [7, 19971130, 1]
te_IN: setting to [7, 19971130, 1]
tg_TJ: setting to [7, 19971130, 1]
the_NP: setting to [7, 19971130, 1]
th_TH: setting to [7, 19971130, 1]
ti_ER: setting to [7, 19971130, 1]
ti_ET: setting to [7, 19971130, 1]
tig_ER: setting to [7, 19971130, 1]
tk_TM: setting to [7, 19971130, 1]
tl_PH: setting to [7, 19971130, 1]
tn_ZA: setting to [7, 19971130, 1]
tr_CY: setting to [7, 19971130, 1]
tr_TR: setting to [7, 19971130, 1]
ts_ZA: setting to [7, 19971130, 1]
tt_RU: setting to [7, 19971130, 1]
tt_RU@iqtelif: setting to [7, 19971130, 1]
ug_CN: setting to [7, 19971130, 1]
ur_IN: setting to [7, 19971130, 1]
ur_PK: setting to [7, 19971130, 1]
uz_UZ: setting to [7, 19971130, 1]
uz_UZ@cyrillic: setting to [7, 19971130, 1]
ve_ZA: setting to [7, 19971130, 1]
vi_VN: setting to [7, 19971130, 1]
wa_BE: setting to [7, 19971130, 4]
wal_ET: setting to [7, 19971130, 1]
wo_SN: setting to [7, 19971130, 1]
xh_ZA: setting to [7, 19971130, 1]
yi_US: setting to [7, 19971130, 1]
yo_NG: setting to [7, 19971130, 1]
yue_HK: setting to [7, 19971130, 1]
zh_CN: setting to [7, 19971130, 1]
zh_HK: setting to [7, 19971130, 1]
zh_SG: setting to [7, 19971130, 1]
zh_TW: setting to [7, 19971130, 1]
zu_ZA: setting to [7, 19971130, 1]
Finally, set first_weekday in all the locales that were omitting it
and wanted something other than the default of 1.
aa_DJ: setting to 7
aa_ER: setting to 2
aa_ER@saaho: setting to 2
ar_AE: setting to 7
ar_BH: setting to 7
ar_DZ: setting to 7
ar_EG: setting to 7
ar_IQ: setting to 7
ar_JO: setting to 7
ar_KW: setting to 7
ar_LB: setting to 2
ar_LY: setting to 7
ar_MA: setting to 7
ar_OM: setting to 7
ar_QA: setting to 7
ar_SD: setting to 7
ar_SS: setting to 2
ar_SY: setting to 7
az_AZ: setting to 2
be_BY: setting to 2
be_BY@latin: setting to 2
ber_DZ: setting to 7
ber_MA: setting to 7
bn_BD: setting to 6
bs_BA: setting to 2
byn_ER: setting to 2
dv_MV: setting to 6
en_NG: setting to 2
es_BO: setting to 2
es_CL: setting to 2
es_EC: setting to 2
es_UY: setting to 2
fo_FO: setting to 2
fr_CH: setting to 2
gd_GB: setting to 2
gez_ER: setting to 2
ha_NG: setting to 2
hr_HR: setting to 2
hy_AM: setting to 2
ig_NG: setting to 2
is_IS: setting to 2
it_CH: setting to 2
ka_GE: setting to 2
kk_KZ: setting to 2
kl_GL: setting to 2
ku_TR: setting to 2
ky_KG: setting to 2
lg_UG: setting to 2
mg_MG: setting to 2
mn_MN: setting to 2
ms_MY: setting to 2
niu_NU: setting to 2
pap_AW: setting to 2
pap_CW: setting to 2
pt_PT: setting to 2
pt_PT@euro: setting to 2
rw_RW: setting to 2
se_NO: setting to 2
si_LK: setting to 2
so_DJ: setting to 7
so_SO: setting to 2
sq_AL: setting to 2
tg_TJ: setting to 2
ti_ER: setting to 2
tig_ER: setting to 2
tk_TM: setting to 2
tt_RU: setting to 2
tt_RU@iqtelif: setting to 2
uz_UZ: setting to 2
uz_UZ@cyrillic: setting to 2
vi_VN: setting to 2
wo_SN: setting to 2
yo_NG: setting to 2
A bunch of locales were copying the wrong source locale -- looks like they
were basically TODOs from the original imports. This led to bad values
for int_prefix for them.
Very few locales set audience/application/abbreviation, and
even the ones that do, set them largely to default/useless
values. Drop them from the few locales until we decide we
want to set these everywhere (to something useful).
This updates a few locales based on CLDR v29 data. I've verified most by
hand while the rest I know are correct.
For int_curr_symbol, it should be 3 characters followed by a space:
ar_SS: changing SDG to SSP
bem_ZM: changing ZMK to ZMW
dz_BT: changing BTN to BTN # Just changing " " to "<U0020>".
en_ZW: changing ZWD to USD
es_SV: changing SVC to USD
lv_LV: changing LVL to EUR
ne_NP: changing INR to NPR
pap_AW: changing ANG to AWG
the_NP: changing INR to NPR
Some of these require updates to iso-4217.def.
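The "3 characters followed by a space" rule for int_curr_symbol is easy to check mechanically. The sketch below is a hypothetical helper, not part of localedef, and it additionally assumes the three characters are uppercase ASCII letters, as ISO 4217 codes are.

```python
import re

# Three uppercase ASCII letters (an assumption based on ISO 4217 codes)
# followed by exactly one space.
_INT_CURR_SYMBOL = re.compile(r"[A-Z]{3} ")


def is_well_formed(value: str) -> bool:
    """Return True if value looks like a valid int_curr_symbol."""
    return _INT_CURR_SYMBOL.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ("SSP ", "SSP", "ssp ", "EUR "):
        print(repr(candidate), is_well_formed(candidate))
```

This only checks the shape of the value; whether the code is actually listed in iso-4217.def is a separate question.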
For currency_symbol, it should be the standard/localized symbol name:
aa_DJ: changing $ to Fdj
ar_SA: changing ريال to ر.س
ar_SS: changing ج.س. to £
az_AZ: changing man. to ₼
bg_BG: changing лв to лв.
ce_RU: changing руб to ₽
crh_UA: changing gr to ₴
cv_RU: changing t to ₽
de_CH: changing Fr. to CHF
dz_BT: changing དངུལ་ཀྲམ་ to Nu.
en_BW: changing Pu to P
en_DK: changing ¤ to kr.
en_PH: changing Php to ₱
en_ZW: changing Z$ to $
es_BO: changing $b to Bs
es_DO: changing $ to RD$
es_HN: changing L. to L
es_PA: changing B/ to B/.
es_SV: changing ₡ to $
fil_PH: changing PhP to ₱
he_IL: changing שח to ₪
hy_AM: changing Դ to ֏
ka_GE: changing ლ to ₾
kk_KZ: changing тг to ₸
ko_KR: changing ₩ to ₩
lg_UG: changing /- to USh
lv_LV: changing Ls to €
mg_MG: changing AR to Ar
mhr_RU: changing ТЕҤ to ₽
my_MM: changing Ks to K
os_RU: changing сом to ₽
pap_AW: changing f to ƒ
pap_CW: changing f to ƒ
ps_AF: changing افغانۍ to ؋
rw_RW: changing Frw to FRw
ru_RU: changing руб to ₽
ru_UA: changing гр to ₴
sd_IN@devanagari: changing रु to ₹
se_NO: changing ru to kr
si_LK: changing ₨ to රු
so_SO: changing $ to S
sq_AL: changing Lek to L
ti_ER: changing $ to Nfk
ti_ET: changing $ to Br
tl_PH: changing PhP to ₱
tr_TR: changing TL to ₺
tt_RU: changing руб to ₽
tt_RU@iqtelif: changing sum to ₽
uz_UZ: changing so'm to soʻm
Note: Some of the characters might not render as they're still quite new
in the Unicode database.
Currently localedef accepts any value for the category keyword. This has
allowed bad values to propagate to the vast majority of locales (~90%).
Add some logic to only accept a few standards.
The ISO 30112 standard defines the valid values for the category
keyword as only a few options:
posix:1993
i18n:2004
i18n:2012
The vast majority of locales had changed the "i18n" string to the
name of its own locale (e.g. "ak_GH:2013") as well as tweaking the
date (presumably thinking it should be the date of submission).
Convert all of them to "i18n:2012" for consistency. A follow up
change will update localedef to actually check/validate the field.
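The kind of check described here can be sketched as follows. This is an illustration only: the function names are hypothetical and it does not reflect how localedef implements the validation in C.

```python
# The values listed above as valid for the category keyword.
VALID_CATEGORIES = frozenset({"posix:1993", "i18n:2004", "i18n:2012"})


def is_valid_category(value: str) -> bool:
    """Return True if value is one of the accepted category strings."""
    return value in VALID_CATEGORIES


def normalize_category(value: str) -> str:
    """Map any unrecognised value to "i18n:2012", mirroring the bulk
    conversion described above."""
    return value if is_valid_category(value) else "i18n:2012"


if __name__ == "__main__":
    for value in ("i18n:2012", "posix:1993", "ak_GH:2013"):
        print(value, is_valid_category(value), normalize_category(value))
```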
This updates a bunch of locales based on CLDR v29 data:
bg_BG: changing Bulgaria to България
bo_CN: changing ཀྲུང་ཧྭ་མི་དམངས་སྤྱི་མཐུན་རྒྱལ་ཁབ། to རྒྱ་ནག
bo_IN: changing རྒྱ་གར to རྒྱ་གར་
cy_GB: changing Cymru to Y Deyrnas Unedig
dz_BT: changing འབྲུག། to འབྲུག
en_US: changing USA to United States
es_US: changing USA to Estados Unidos
gd_GB: changing Breatainn Mhòr to An Rìoghachd Aonaichte
ha_NG: changing Nigeria to Najeriya
mk_MK: changing Macedonia to Македонија
mn_MN: changing Mongolia to Монгол
sq_MK: changing Macedonia to Maqedoni
sr_RS@latin: changing Srbija i Crna Gora to Srbija
tr_CY: changing Northern Cyprus to Kıbrıs
tr_TR: changing Turkey to Türkiye
ug_CN: changing 中华人民共和国 to جۇڭگو
uz_UZ: changing O'zbekistan to Oʻzbekiston
vi_VN: changing Việt nam to Việt Nam
wae_CH: changing Switzerland to Schwiz
yi_US: changing די פֿאראײניקטע שטאַטן to פֿאַראייניגטע שטאַטן
yo_NG: changing Nigeria to Orílẹ́ède Nàìjíríà
yue_HK: changing 香港 to 中華人民共和國香港特別行政區
zu_ZA: changing Mzansi Afrika to i-South Africa
These all look largely straightforward. Many had English translations
instead of native, and a few have been updated. I can't verify some of
them as I'm not personally familiar, but the CLDR data matches.
The USA->United States seems a little odd, but that is also what the
CLDR database uses everywhere (rather than "United States of America").
We can also fill in a country name where there wasn't one before.
Many look correct to me (mostly the English ones), but there are also
many that I have no idea about. But it can't be worse than leaving it
blank? :)
ar_AE: changing to الإمارات العربية المتحدة
ar_BH: changing to البحرين
ar_DZ: changing to الجزائر
ar_EG: changing to مصر
ar_IN: changing to الهند
ar_IQ: changing to العراق
ar_JO: changing to الأردن
ar_KW: changing to الكويت
ar_LB: changing to لبنان
ar_LY: changing to ليبيا
ar_MA: changing to المغرب
ar_OM: changing to عُمان
ar_QA: changing to قطر
ar_SA: changing to المملكة العربية السعودية
ar_SD: changing to السودان
ar_SS: changing to جنوب السودان
ar_SY: changing to سوريا
ar_TN: changing to تونس
ar_YE: changing to اليمن
as_IN: changing to ভাৰত
ast_ES: changing to España
az_AZ: changing to Azərbaycan
be_BY: changing to Беларусь
bn_IN: changing to ভারত
br_FR: changing to Frañs
brx_IN: changing to भारत
bs_BA: changing to Bosna i Hercegovina
ca_AD: changing to Andorra
ca_ES: changing to Espanya
ca_FR: changing to França
ca_IT: changing to Itàlia
ce_RU: changing to Росси
da_DK: changing to Danmark
de_AT: changing to Österreich
de_BE: changing to Belgien
de_CH: changing to Schweiz
de_LU: changing to Luxemburg
el_CY: changing to Κύπρος
el_GR: changing to Ελλάδα
en_AG: changing to Antigua & Barbuda
en_AU: changing to Australia
en_BW: changing to Botswana
en_CA: changing to Canada
en_DK: changing to Denmark
en_GB: changing to United Kingdom
en_HK: changing to Hong Kong SAR China
en_IE: changing to Ireland
en_IN: changing to India
en_NZ: changing to New Zealand
en_PH: changing to Philippines
en_SG: changing to Singapore
en_ZW: changing to Zimbabwe
es_AR: changing to Argentina
es_BO: changing to Bolivia
es_CL: changing to Chile
es_CO: changing to Colombia
es_CU: changing to Cuba
es_DO: changing to República Dominicana
es_EC: changing to Ecuador
es_ES: changing to España
es_GT: changing to Guatemala
es_HN: changing to Honduras
es_MX: changing to México
es_NI: changing to Nicaragua
es_PA: changing to Panamá
es_PE: changing to Perú
es_PR: changing to Puerto Rico
es_PY: changing to Paraguay
es_SV: changing to El Salvador
es_UY: changing to Uruguay
es_VE: changing to Venezuela
eu_ES: changing to Espainia
fil_PH: changing to Pilipinas
fo_FO: changing to Føroyar
fr_BE: changing to Belgique
fr_CA: changing to Canada
fr_CH: changing to Suisse
fr_FR: changing to France
fr_LU: changing to Luxembourg
fur_IT: changing to Italie
fy_DE: changing to Dútslân
fy_NL: changing to Nederlân
ga_IE: changing to Éire
gl_ES: changing to España
gu_IN: changing to ભારત
gv_GB: changing to Rywvaneth Unys
he_IL: changing to ישראל
hi_IN: changing to भारत
hr_HR: changing to Hrvatska
hu_HU: changing to Magyarország
id_ID: changing to Indonesia
is_IS: changing to Ísland
it_CH: changing to Svizzera
it_IT: changing to Italia
ja_JP: changing to 日本
ka_GE: changing to საქართველო
kk_KZ: changing to Қазақстан
kl_GL: changing to Kalaallit Nunaat
kn_IN: changing to ಭಾರತ
kok_IN: changing to भारत
ko_KR: changing to 대한민국
ks_IN: changing to ہِنٛدوستان
ks_IN@devanagari: changing to भारत
kw_GB: changing to Rywvaneth Unys
ky_KG: changing to Кыргызстан
lt_LT: changing to Lietuva
lv_LV: changing to Latvija
mg_MG: changing to Madagasikara
ml_IN: changing to ഇന്ത്യ
mr_IN: changing to भारत
ms_MY: changing to Malaysia
mt_MT: changing to Malta
nb_NO: changing to Norge
ne_NP: changing to नेपाल
nl_AW: changing to Aruba
nl_BE: changing to België
nl_NL: changing to Nederland
nn_NO: changing to Noreg
or_IN: changing to ଭାରତ
os_RU: changing to Уӕрӕсе
pa_IN: changing to ਭਾਰਤ
pa_PK: changing to ਪਾਕਿਸਤਾਨ
pl_PL: changing to Polska
pt_BR: changing to Brasil
pt_PT: changing to Portugal
ru_RU: changing to Россия
ru_UA: changing to Украина
sd_IN@devanagari: changing to भारत
se_NO: changing to Norga
si_LK: changing to ශ්රී ලංකාව
sk_SK: changing to Slovensko
sl_SI: changing to Slovenija
sq_AL: changing to Shqipëri
sv_SE: changing to Sverige
ta_IN: changing to இந்தியா
ta_LK: changing to இலங்கை
ur_IN: changing to بھارت
ur_PK: changing to پاکستان
These entries have been checked mostly against Wikipedia, but also using
the sources it cites (like the UN and other treaty sources).
Fix incorrect values:
en_BW: changing RB to BW
kl_GL: changing GRO to KN
km_KH: changing LAO to KH
my_MM: changing BA to MYA
oc_FR: changing F to F
tr_CY: changing TR to CY
wae_CH: changing DH to CH
Add missing entries:
aa_DJ: changing to DJI
ak_GH: changing to GH
ar_OM: changing to OM
ar_SS: changing to SUD
ar_YE: changing to YAR
bo_CN: changing to CHN
cmn_TW: changing to RC
dv_MV: changing to MV
dz_BT: changing to BHT
en_AG: changing to AG
es_HN: changing to HN
es_PR: changing to PR
hak_TW: changing to RC
lzh_TW: changing to RC
nan_TW: changing to RC
nan_TW@latin: changing to RC
nl_AW: changing to AUA
pap_AW: changing to AUA
so_DJ: changing to DJI
the_NP: changing to NEP
ug_CN: changing to CHN
yue_HK: changing to HK
zh_CN: changing to CHN
zh_HK: changing to HK
zh_TW: changing to RC
This updates a few locales based on CLDR v29 data.
Add missing fields:
as_IN: changing to 356
dv_MV: changing to 462
kk_KZ: changing to 398
my_MM: changing to 104
rw_RW: changing to 646
tt_RU: changing to 643
Update ones that are wrong:
dz_BT: changing BHU to 064
en_PH: changing 360 to 608
km_KH: changing 418 to 116
ky_KG: changing 643 to 417
tr_CY: changing 792 to 196
wo_SN: changing 450 to 686
As a result of fixing these, I had to update country_ab[23]:
dz_BT: changing BHU to BTN
en_PH: changing ID/IDN to PH/PHL
km_KH: changing LA/LAO to KH/KHM
ky_KG: changing KY/KYR to KG/KGZ
tr_CY: changing TR/TUR to CY/CYP
wo_SN: changing MG/MDG to SN/SEN
Pad with leading zeros to match the standard and other locales:
ber_DZ: changing 12 to 012
ca_AD: changing 20 to 020
en_AG: changing 28 to 028
hy_AM: changing 51 to 051
li_BE: changing 56 to 056
wa_BE: changing 56 to 056
I hand checked the first two sets against ISO 3166-1 directly.
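The zero padding described above can be sketched as follows. This is only an illustration in Python; the helper name pad_country_num is hypothetical and is not part of glibc or its scripts.

```python
def pad_country_num(value):
    """Return an ISO 3166-1 numeric code as a three-digit string."""
    number = int(value)
    if not 0 <= number <= 999:
        raise ValueError(f"not a three-digit numeric code: {value!r}")
    return f"{number:03d}"


if __name__ == "__main__":
    # Values taken from the list above.
    for locale_name, old in [("ber_DZ", "12"), ("ca_AD", "20"), ("hy_AM", "51")]:
        print(locale_name, old, "->", pad_country_num(old))
```

Codes that already have three digits pass through unchanged, so the helper could be applied to every locale without special cases.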
There are only two page sizes that locales use: US-Letter and A4.
For the former, move to copying the en_US locale, while for the
latter, move to copying the i18n locale. This lets us clean up
all the stray comments like FIXME.
There should be no functional differences here.
There are only two measurement systems that locales use: US and metric.
For the former, move to copying the en_US locale, while for the latter,
move to copying the i18n locale. This lets us clean up all the stray
comments like FIXME.
There should be no functional differences here.
This updates all the territory fields based on CLDR v29 data. Many of
them were obviously incorrect where people used a two letter code and
not the English name.
aa_DJ: changing DJ to Djibouti
aa_ER@saaho: changing ER to Eritrea
aa_ER: changing ER to Eritrea
aa_ET: changing ET to Ethiopia
am_ET: changing ET to Ethiopia
ar_LY: changing Libyan Arab Jamahiriya to Libya
ar_SY: changing Syrian Arab Republic to Syria
bo_CN: changing P.R. of China to China
bs_BA: changing Bosnia and Herzegowina to Bosnia & Herzegovina
byn_ER: changing ER to Eritrea
ca_IT: changing Italy (L'Alguer) to Italy
ce_RU: changing RUSSIAN FEDERATION to Russia
cmn_TW: changing Republic of China to Taiwan
cy_GB: changing Great Britain to United Kingdom
de_LU@euro: changing Luxemburg to Luxembourg
de_LU: changing Luxemburg to Luxembourg
en_AG: changing Antigua and Barbuda to Antigua & Barbuda
en_GB: changing Great Britain to United Kingdom
en_HK: changing Hong Kong to Hong Kong SAR China
en_US: changing USA to United States
es_US: changing USA to United States
fr_LU@euro: changing Luxemburg to Luxembourg
fr_LU: changing Luxemburg to Luxembourg
fy_DE: changing DE to Germany
gd_GB: changing Great Britain to United Kingdom
gez_ER@abegede: changing ER to Eritrea
gez_ER: changing ER to Eritrea
gez_ET@abegede: changing ET to Ethiopia
gez_ET: changing ET to Ethiopia
gv_GB: changing Britain to United Kingdom
hak_TW: changing Republic of China to Taiwan
iu_CA: changing CA to Canada
ko_KR: changing Republic of Korea to South Korea
kw_GB: changing Britain to United Kingdom
li_BE: changing BE to Belgium
li_NL: changing NL to Netherlands
lzh_TW: changing Republic of China to Taiwan
my_MM: changing Myanmar to Myanmar (Burma)
nan_TW: changing Republic of China to Taiwan
nds_DE: changing DE to Germany
nds_NL: changing NL to Netherlands
om_ET: changing ET to Ethiopia
om_KE: changing KE to Kenya
pap_AW: changing AW to Aruba
pap_CW: changing CW to Curaçao
pt_BR: changing Brasil to Brazil
sid_ET: changing ET to Ethiopia
sk_SK: changing Slovak to Slovakia
so_DJ: changing DJ to Djibouti
so_ET: changing ET to Ethiopia
so_KE: changing KE to Kenya
so_SO: changing SO to Somalia
ti_ER: changing ER to Eritrea
ti_ET: changing ET to Ethiopia
tig_ER: changing ER to Eritrea
tt_RU@iqtelif: changing Tatarstan, Russian Federation to Russia
uk_UA: changing UA to Ukraine
unm_US: changing USA to United States
wal_ET: changing ET to Ethiopia
yi_US: changing USA to United States
yue_HK: changing Hong Kong to Hong Kong SAR China
zh_CN: changing P.R. of China to China
zh_HK: changing Hong Kong to Hong Kong SAR China
zh_TW: changing Taiwan R.O.C. to Taiwan
This updates all the language fields based on CLDR v29 data. Many of
them were obviously incorrect where people used a two letter code and
not the English name.
aa_DJ: changing aa to Afar
aa_ER: changing aa to Afar
aa_ER@saaho: changing aa to Afar
aa_ET: changing aa to Afar
am_ET: changing am to Amharic
az_AZ: changing Azeri to Azerbaijani
bn_BD: changing Bengali/Bangla to Bengali
byn_ER: changing byn to Blin
de_AT: changing German to Austrian German
de_CH: changing German to Swiss High German
en_AU: changing English to Australian English
en_CA: changing English to Canadian English
en_GB: changing English to British English
en_US: changing English to American English
es_ES: changing Spanish to European Spanish
es_MX: changing Spanish to Mexican Spanish
ff_SN: changing ff to Fulah
fr_CA: changing French to Canadian French
fr_CH: changing French to Swiss French
fur_IT: changing Furlan to Friulian
fy_DE: changing fy to Western Frisian
fy_NL: changing Frisian to Western Frisian
gd_GB: changing Scots Gaelic to Scottish Gaelic
gez_ER@abegede: changing gez to Geez
gez_ER: changing gez to Geez
gez_ET@abegede: changing gez to Geez
gez_ET: changing gez to Geez
gv_GB: changing Manx Gaelic to Manx
ht_HT: changing Kreyol to Haitian Creole
kl_GL: changing Greenlandic to Kalaallisut
lg_UG: changing Luganda to Ganda
li_BE: changing li to Limburgish
li_NL: changing li to Limburgish
nan_TW@latin: changing Minnan to Min Nan Chinese
nb_NO: changing Norwegian, Bokmål to Norwegian Bokmål
nds_DE: changing nds to Low German
nds_NL: changing nds to Low Saxon
niu_NU: changing Vagahau Niue (Niuean) to Niuean
niu_NZ: changing Vagahau Niue (Niuean) to Niuean
nl_BE: changing Dutch to Flemish
nn_NO: changing Norwegian, Nynorsk to Norwegian Nynorsk
nr_ZA: changing Southern Ndebele to South Ndebele
om_ET: changing om to Oromo
om_KE: changing om to Oromo
or_IN: changing Odia to Oriya
os_RU: changing Ossetian to Ossetic
pap_AW: changing pap to Papiamento
pap_CW: changing pap to Papiamento
pa_PK: changing Punjabi (Shahmukhi) to Punjabi
pt_BR: changing Portuguese to Brazilian Portuguese
pt_PT: changing Portuguese to European Portuguese
se_NO: changing Northern Saami to Northern Sami
sid_ET: changing sid to Sidamo
so_DJ: changing so to Somali
so_ET: changing so to Somali
so_KE: changing so to Somali
so_SO: changing so to Somali
st_ZA: changing Sotho to Southern Sotho
sw_KE: changing sw to Swahili
sw_TZ: changing sw to Swahili
ti_ER: changing ti to Tigrinya
ti_ET: changing ti to Tigrinya
tig_ER: changing tig to Tigre
uk_UA: changing uk to Ukrainian
wal_ET: changing wal to Wolaytta
yue_HK: changing Yue Chinese to Cantonese
There's no real value in populating this field when it's the same as the
default POSIX setting, so drop it from most locales so it's clear what's
going on.
These locales should be using A4 paper size rather than US-Letter.
Update the copy points to match the others in the file. All other
locales have been verified against the CLDR and hand checking.
From the bug:
Obsolete locale. The ISO-639 code for Hebrew was changed from 'iw'
to 'he' in 1989, according to Bruno Haible on libc-alpha 2003-09-01.
Reported-by: Chris Leonard <cjlhomeaddress@gmail.com>
From the bug:
Netherlands Antilles was dissolved, and "AN" is not a part of ISO 3166
anymore. According to setlocale(3), "territory is an ISO 3166 country
code". We now have pap_AW and pap_CW.
Reported-by: Chris Leonard <cjlhomeaddress@gmail.com>
This updates a bunch of locales based on CLDR v28 data:
ar_SS: int_prefix: changing 249 to 211
bn_BD: int_prefix: changing 88 to 880
dz_BT: int_prefix: changing 66 to 975
en_HK: int_prefix: changing to 852
en_PH: int_prefix: changing to 63
en_SG: int_prefix: changing to 65
es_DO: int_prefix: changing 1809 to 1
es_PA: int_prefix: changing 502 to 507
es_PR: int_prefix: changing 1787 to 1
km_KH: int_prefix: changing 856 to 855
mt_MT: int_prefix: changing to 356
ne_NP: int_prefix: changing 91 to 977
pap_AW: int_prefix: changing 599 to 297
the_NP: int_prefix: changing 91 to 977
tk_TM: int_prefix: changing to 993
uz_UZ: int_prefix: changing 27 to 998
zh_SG: int_prefix: changing to 65
I've also checked these against https://countrycode.org/.
Note: the Dominican Republic (DO) and Puerto Rico (PR) updates are
correct: they both use +1. Historically, DO had a single area code,
809, and PR had 787, which is why they were listed as such, but both
have since added area codes (829 and 939 respectively), so using the
four-digit value is definitely incorrect now.
The only official source is the "Official spelling dictionary of the
Bulgarian language, Prosveta 2012", which states there are three ways
to separate time components: comma, colon and dot. That same dictionary
doesn't say which one is preferred.
So I turned to the mailing list of the translators of free software in
Bulgarian. The consensus is that colon is the only separator that is
widely used in Bulgarian texts and everything else will just be confusing.
URL: http://lists.ludost.net/pipermail/dict/2015-December/000538.html
This patch makes the automation of Unicode LC_CTYPE generation also
support generating the modified LC_CTYPE used for Turkish (where case
conversions of 'i' and 'I' differ from ASCII conventions), so allowing
that to be more readily kept in sync for future Unicode updates. The
patch includes the locale update generated by the scripts.
Tested for x86_64.
[BZ #18491]
* unicode-gen/unicode_utils.py (to_upper_turkish): New function.
(to_lower_turkish): Likewise.
* unicode-gen/gen_unicode_ctype.py (output_tables): Support
producing output with Turkish case conversions.
(--turkish): New command-line option.
* unicode-gen/Makefile (GENERATED): Add tr_TR.
(tr_TR): New rule.
* locales/tr_TR: Regenerate LC_CTYPE.
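A minimal sketch of the Turkish special case, assuming simple one-to-one case mappings on code points. The function names turkish_upper and turkish_lower are hypothetical and this is not the code from unicode_utils.py; it falls back to Python's own case tables instead of UnicodeData.txt.

```python
def turkish_upper(code_point):
    """Uppercase mapping with the Turkish dotted/dotless i rule."""
    if code_point == 0x0069:        # 'i' maps to U+0130 (capital I with dot above)
        return 0x0130
    upper = chr(code_point).upper()
    # Keep the code point unchanged when the default mapping is not 1:1.
    return ord(upper) if len(upper) == 1 else code_point


def turkish_lower(code_point):
    """Lowercase mapping with the Turkish dotted/dotless i rule."""
    if code_point == 0x0049:        # 'I' maps to U+0131 (dotless small i)
        return 0x0131
    if code_point == 0x0130:        # simple mapping from UnicodeData.txt: U+0069
        return 0x0069
    lower = chr(code_point).lower()
    return ord(lower) if len(lower) == 1 else code_point


if __name__ == "__main__":
    for cp in (0x0049, 0x0069, 0x0130, 0x0131):
        print(f"U+{cp:04X} upper U+{turkish_upper(cp):04X} "
              f"lower U+{turkish_lower(cp):04X}")
```

All other code points keep their default mappings, which is why only 'i' and 'I' need explicit handling.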
Update __STDC_ISO_10646__ to 201505L for Unicode 8.0.0.
Update character encoding, ctype, and transliteration tables.
New scripts autogenerate transliteration tables.
- Remove duplicate transliterations for U+0152 and U+0153 from
C-translit.h.in.
- Change Ö U+00D6 LATIN CAPITAL LETTER O WITH STROKE → O
(instead of → OE)
- Change ö U+00F6 LATIN SMALL LETTER O WITH STROKE → o
(instead of → oe)
- Add ₹ U+20B9 INDIAN RUPEE SIGN → INR
- Add ₫ U+20AB DONG SIGN → Dong (in addition to "₫ → Đồng")
- Add many others from
http://unicode.org/cldr/trac/browser/trunk/common/transforms/Latin-ASCII.xml
- Add some more currency signs suggested by Marko Myllynen
- Add another patch with more characters by Marko Myllynen
In preparation to fix the --localedir configure argument we must
move the existing conflicting definition of localedir to a more
appropriate name. Given that all current internal uses of localedir
relate to the compiled locales we rename to complocaledir.
The previous (11th) version of the Hungarian spelling rules (released
in 1984) said that the separator had to be a dot, e.g. 10.35 meaning
10 o'clock 35 minutes. glibc correctly implements this.
The brand new (12th) version, in effect since September 1, 2015, adapts
to the common use of the colon (especially in the digital world) and
allows either separator, without even expressing a preference.
For computer systems, using colons is way more typical and probably
easier to recognize. Dot is typically used in printed materials.
It also avoids an almost ambiguous situation where a space makes a
difference, e.g. "10.15-ig" means "until 10 o'clock 15 minutes"
whereas "10. 15-ig" means "until 15th of October". So I believe using
the colon as the separator is not only more frequent in the computer
world, but is also easier and quicker to recognize for the brain that
it's about hour:minute rather than month and day. And luckily it's now
equally correct according to the official rules.
11th edition: http://helyesiras.mta.hu/helyesiras/default/akh11
12th edition: http://helyesiras.mta.hu/helyesiras/default/akh12
In both editions it's the very last (299th and 300th, respectively) rule.
Microsoft also uses and recommends a colon since at least May 2011:
http://download.microsoft.com/download/e/6/1/e61266b2-d8b4-4fe0-a553-f01dc3976675/hun-hun-StyleGuide.pdf
The time format is different in common language and in the language of
IT. In common texts we usually do not abbreviate, so the full forms are
used: “7 óra 10 perckor csörgött a telefon”. However, the short format,
consisting of numerals only, can also be used. In this case a period
must be used between the two numbers and there must not be a space
between them: “találkozzunk 10.45-kor”.
However, in software mostly the short format is used, and the numbers
are separated by a colon. An obvious example is the clock in the bottom
right corner of your screen, thus 18:31.
lang_lib (which reflects ISO 639-2/B (bibliographic) codes) and
lang_term (which reflects ISO 639-2/T (terminology) codes) should be
identical except for those languages for which ISO 639-2 specifies
separate bibliographic/terminology values.
I used this Library of Congress page as the source:
http://www.loc.gov/standards/iso639-2/php/code_list.php
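To illustrate the rule, here is a small Python sketch with a handful of ISO 639-2 entries. The table is a tiny hand-picked sample, not the full code list, and the helper name differing is hypothetical.

```python
# ISO 639-2 codes as (bibliographic, terminology) pairs; sample entries only.
ISO_639_2 = {
    "German": ("ger", "deu"),
    "French": ("fre", "fra"),
    "Dutch": ("dut", "nld"),
    "Czech": ("cze", "ces"),
    "Italian": ("ita", "ita"),
    "Hungarian": ("hun", "hun"),
}


def differing(table):
    """Return the languages whose lang_lib and lang_term values must differ."""
    return sorted(name for name, (lib, term) in table.items() if lib != term)


if __name__ == "__main__":
    print(differing(ISO_639_2))
```

For every language not returned by differing, a locale would be expected to carry the same value in lang_lib and lang_term.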
As discussed in the thread starting at
https://sourceware.org/ml/libc-alpha/2015-06/msg00098.html
it looks like the best option is to remove locale timezone information
from locales which currently provide it (in incomplete or incorrect
fashion) rather than to start duplicating tzdata info in glibc.
Repertoire maps and character mnemonics were used early in the glibc
i18n/l10n effort but were quickly deprecated in favor of Unicode code
points. According to ChangeLog, the in-tree repertoire maps were
removed 2000-07-07 but some stray references remain even today. The
patch below removes them.
After the rename, localedef complains and the build fails:
LC_ADDRESS: field `lang_ab' must not be defined
Before the rename, the locale names matched the lang_ab definitions
('tu' and 'bh'), but after the rename they do not.
The Bhili [1] and Tulu [2] languages do not have ISO 639-1 codes. This
patch moves the locale files to the correct codes and also fixes iso-639.def.
1. http://www-01.sil.org/iso639-3/documentation.asp?id=bhb
2. http://www-01.sil.org/iso639-3/documentation.asp?id=tcy
localedata/ChangeLog:
2015-07-02 Pravin Satpute <psatpute@redhat.com>
[BZ #17475]
* locales/tu_IN: renamed to tcy_IN
* locales/bh_IN: renamed to bhb_IN
Changelog:
2015-03-05 Pravin Satpute <psatpute@redhat.com>
[BZ #17475]
* locale/iso-639.def: Update Bhili and Tulu language codes as
per iso639-3.
These tests were skipped by the use-test-skeleton conversion done in
commit 29955b5d because they were reused in other tests via the #include
directive, and so deemed worth an inspection before they were modified.
This has now been done.
ChangeLog:
2015-07-09 Arjun Shankar <arjun.is@lostca.se>
* elf/tst-leaks1.c (main): Converted to ...
(do_test): ... this.
(TEST_FUNCTION): New macro.
Include test-skeleton.c.
* localedata/tst-langinfo.c (main): Converted to ...
(do_test): ... this.
(TEST_FUNCTION): New macro.
Include test-skeleton.c.
* math/test-fpucw.c (main): Converted to ...
(do_test): ... this.
(TEST_FUNCTION): New macro.
Include test-skeleton.c.
* math/test-tgmath.c (main): Converted to ...
(do_test): ... this.
(TEST_FUNCTION): New macro.
Include test-skeleton.c.
* math/test-tgmath2.c (main): Converted to ...
(do_test): ... this.
(TEST_FUNCTION): New macro.
Include test-skeleton.c.
* setjmp/tst-setjmp.c (main): Converted to ...
(do_test): ... this.
(TEST_FUNCTION): New macro.
Include test-skeleton.c.
* stdio-common/tst-sscanf.c (main): Converted to ...
(do_test): ... this.
(TEST_FUNCTION): New macro.
Include test-skeleton.c.
* sysdeps/x86_64/tst-audit6.c (main): Converted to ...
(do_test): ... this.
(TEST_FUNCTION): New macro.
Include test-skeleton.c.
In the introduction to the official orthography rules for the Ukrainian
language (http://spelling.ulif.org.ua/peredmova.htm) there's a note
that only the apostrophe does not affect the order of words when sorting.
As can be seen from the official alphabet, the soft sign
(U+044C/U+042C) has its own fixed position and thus affects the order, and
the letters "е" and "є" (CYR-IE: U+0435/U+0415 and UKR-IE:
U+0454/U+0404) also have their own positions and should sort as separate
letters.
This also corresponds to official Unicode collation chart for these
letters: http://unicode.org/charts/collation/chart_Cyrillic.html
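The ordering rule can be sketched in Python as a single-level sort key. This is a simplified illustration, not the collation table from the locale file; the names RANK, APOSTROPHES and sort_key are hypothetical.

```python
# The 33 letters of the Ukrainian alphabet in their official order.
UKRAINIAN_ALPHABET = "абвгґдеєжзиіїйклмнопрстуфхцчшщьюя"
RANK = {letter: index for index, letter in enumerate(UKRAINIAN_ALPHABET)}

# Characters commonly used as the apostrophe; ignored when ordering.
APOSTROPHES = {"'", "\u2019", "\u02bc"}


def sort_key(word):
    """Rank each letter by alphabet position and skip apostrophes."""
    return [
        RANK.get(ch, len(RANK) + ord(ch))   # unknown characters sort last
        for ch in word.lower()
        if ch not in APOSTROPHES
    ]


if __name__ == "__main__":
    words = ["жито", "єдність", "елемент"]
    print(sorted(words))                # code point order puts "є" after "ж"
    print(sorted(words, key=sort_key))  # alphabet order puts "є" right after "е"
```

A real collation definition also has to handle case and multiple comparison levels, which this sketch leaves out.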
Both bo_CN and bo_IN were not compiling. The following fix
gets them into a usable state again giving a clean build
result for `make localedata/install-locales`.
This patch prepares for the strcoll benchmark by moving the makefile
code for generating the locale files into a standalone snippet that
can be used elsewhere.
This patch adjusts the expected currency symbol kr to kr. after commit
"Update currency_symbol in da_DK"
(92566b4922) which changed it.
* tst-strfmon1.c (tests): Update expected currency symbol.
[BZ #18206]
* wcsmbs/wcsncmp.c (wcsncmp): Compare as wchar_t, not wint_t.
Use signed comparison instead of subtraction to avoid
overflow bug.
* localedata/tests-mbwc/tst_wcsncmp.c (tst_wcsncmp):
Take the sign of ret.
* localedata/tests-mbwc/dat_wcsncmp.c (tst_wcsncmp_loc):
Do not expect precise return values. Only the sign matters.
* wcsmbs/Makefile (strop-tests): Add wcsncmp.
* wcsmbs/test-wcsncmp.c: New File.
* string/test-strncmp.c: Add wcsncmp support.
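Why subtraction is unsafe here can be shown with a small Python model. It assumes wchar_t is a signed 32-bit type and models overflow as two's-complement wraparound (in C the overflow is undefined behavior, so this is only one possible outcome). All names below are hypothetical.

```python
INT32_MAX = 2**31 - 1


def wrap_int32(value):
    """Reduce an integer to the 32-bit two's-complement range."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value > INT32_MAX else value


def compare_by_subtraction(a, b):
    """Model of 'return a - b' evaluated in 32-bit arithmetic."""
    return wrap_int32(a - b)


def compare_by_comparison(a, b):
    """Return only the sign: -1, 0 or 1."""
    return (a > b) - (a < b)


if __name__ == "__main__":
    a, b = INT32_MAX, -1
    print(compare_by_subtraction(a, b))   # negative although a > b
    print(compare_by_comparison(a, b))    # positive, as expected
```

This is also why the test data above checks only the sign of the result rather than a precise value.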
Commit 9781a37002 changed the expected
result of mbrlen to -2 when n=0 is passed. The initialization of
tst_mbrlen_loc and tst_mbrtowc should be updated accordingly.
* tests-mbwc/dat_mbrlen.c (tst_mbrlen_loc): Change expected
result to -2 in case of n == 0.
* tests-mbwc/tst_mbrtowc.c (tst_mbrtowc): Check result against
-2 instead of 0.
for ChangeLog
* include/stdc-predef.h (__STDC_ISO_10646__): Update to
201304L, for Unicode 7.
for localedata/ChangeLog
* unicode-gen/ctype_compatibility.py: Use date ranges in
copyright notice.
* unicode-gen/ctype_compatibility_test_cases.py: Likewise.
* unicode-gen/gen_unicode_ctype.py: Likewise.
* unicode-gen/utf8_compatibility.py: Likewise.
* unicode-gen/utf8_gen.py: Likewise. Use upper case for
global variables, use tuples for global constant arrays. From
Mike FABIAN. Suggested by Mike Frysinger <vapier@gentoo.org>.
for localedata/ChangeLog
[BZ #17588]
[BZ #13064]
[BZ #14094]
[BZ #17998]
* unicode-gen/Makefile: New.
* unicode-gen/unicode-license.txt: New, from Unicode.
* unicode-gen/UnicodeData.txt: New, from Unicode.
* unicode-gen/DerivedCoreProperties.txt: New, from Unicode.
* unicode-gen/EastAsianWidth.txt: New, from Unicode.
* unicode-gen/gen_unicode_ctype.py: New generator, from Mike
FABIAN <mfabian@redhat.com>.
* unicode-gen/ctype_compatibility.py: New verifier, from
Pravin Satpute <psatpute@redhat.com> and Mike FABIAN.
* unicode-gen/ctype_compatibility_test_cases.py: New verifier
module, from Mike FABIAN.
* unicode-gen/utf8_gen.py: New generator, from Pravin Satpute
and Mike FABIAN.
* unicode-gen/utf8_compatibility.py: New verifier, from Pravin
Satpute and Mike FABIAN.
* charmaps/UTF-8: Update.
* locales/i18n: Update.
* gen-unicode-ctype.c: Remove.
* tst-ctype-de_DE.ISO-8859-1.in: Adjust, islower now returns
true for ordinal indicators.
[Modified from the original email by Siddhesh Poyarekar]
This patch solves bug #16009 by implementing an additional path in
strxfrm that does not depend on caching the weight and rule indices.
In detail the following changed:
* The old main loop was factored out of strxfrm_l into the function
do_xfrm_cached to be able to alternatively use the non-caching version
do_xfrm.
* strxfrm_l allocates a fixed-size array on the stack. If this is not
sufficient to store the weight and rule indices, the non-caching path is
taken. As the cache size is not dependent on the input there can be no
problems with integer overflows or stack allocations greater than
__MAX_ALLOCA_CUTOFF. Note that malloc-ing is not possible because the
definition of strxfrm does not allow out-of-memory error handling.
* The uncached path determines the weight and rule index for every char
and for every pass again.
* Passing all the locale data array by array resulted in very long
parameter lists, so I introduced a structure that holds them.
* Checking for zero src string has been moved a bit upwards; it is
before the locale data initialization now.
* To verify that the non-caching path works correctly I added a test run
to localedata/sort-test.sh & localedata/xfrm-test.c where all strings
are padded with spaces so that they are too large for the caching path.
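The two-path design can be illustrated with a toy model in Python. This is a sketch of the idea only, not the glibc code: the cache size, the number of passes and the weight function are all made-up assumptions, and the function names are hypothetical.

```python
CACHE_SIZE = 8      # hypothetical limit; the real value is a glibc internal
NUM_PASSES = 2      # toy model with two collation levels


def lookup_weights(ch):
    """Toy weights (primary, secondary); stands in for the locale tables."""
    return (ord(ch.lower()), 1 if ch.isupper() else 0)


def xfrm_cached(text):
    """Look up every character once, then reuse the results in each pass."""
    cache = [lookup_weights(ch) for ch in text]
    out = []
    for level in range(NUM_PASSES):
        out.extend(weights[level] for weights in cache)
        out.append(-1)                      # level separator
    return out


def xfrm_uncached(text):
    """Repeat the lookup for every character in every pass; no cache needed."""
    out = []
    for level in range(NUM_PASSES):
        for ch in text:
            out.append(lookup_weights(ch)[level])
        out.append(-1)
    return out


def xfrm(text):
    """Use the cache only when the input fits into the fixed-size array."""
    if len(text) <= CACHE_SIZE:
        return xfrm_cached(text)
    return xfrm_uncached(text)
```

Both paths must produce identical output, which is what padding the test strings with spaces checks: the same inputs are pushed through the non-caching path and the results are compared.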
Allow building tests in a cross configuration without a test wrapper
defined. This is helpful for doing simple build testing of tests.
ChangeLog:
2014-09-30 Will Newton <will.newton@linaro.org>
* localedata/Makefile: Move assignment to tests-special
into an ifdef testing run-built-tests.
* timezone/Makefile: Likewise.
The glibc makefiles have a standard variable, $(rtld-prefix), to run
the dynamic linker with a default --library-path option; this is used
as the basis of lots of other variables for running programs compiled
with the newly built library.
A few places however use $(elf-objpfx)ld.so or
$(elf-objpfx)${rtld-installed-name} directly, with such a
--library-path option. This patch makes such places use
$(rtld-prefix) instead. I'm not aware of any significance in these
cases to the choice of ld.so or ${rtld-installed-name} when running
the dynamic linker, or to whether $(patsubst
%,:%,$(sysdep-library-path)) is included in the library-path as it is
in $(rtld-prefix) and just one of the places being changed.
Tested x86_64.
* elf/Makefile ($(objpfx)tst-unused-dep.out): Use $(rtld-prefix).
* iconvdata/Makefile ($(inst_gconvdir)/gconv-modules)
[$(cross-compiling) = no]: Likewise.
* sysdeps/s390/s390-64/Makefile ($(inst_gconvdir)/gconv-modules)
[$(cross-compiling) = no]: Likewise.
localedata/ChangeLog:
* Makefile (LOCALEDEF): Use $(rtld-prefix).
The localedata tests tst-mbswcs and tst-wctype use custom .sh scripts
and makefile rules, but have no need to do so. tst-mbswcs.sh runs a
series of test programs in succession (and nothing special is done
with the output of the programs); this patch makes the separate tests
into ordinary tests run directly by the usual makefile rules.
tst-wctype.sh runs one test with an environment variable and input
redirection; generic makefile rules also cover that, so again this
patch converts it into an ordinary test. (The makefile dependency of
tst-wctype.out on sort-test.out that this patch removes appears to be
a cut-and-paste error; the test does not appear to use that file.
There is already a generic dependency of ordinary tests in this
directory on $(addprefix $(objpfx),$(CTYPE_FILES)).)
Tested x86_64.
localedata/ChangeLog:
* Makefile (test-srcs): Remove tst-mbswcs1, tst-mbswcs2,
tst-mbswcs3, tst-mbswcs4, tst-mbswcs5 and tst-wctype.
(generated): Remove tst-mbswcs.out.
(tests): Add tst-mbswcs1, tst-mbswcs2, tst-mbswcs3, tst-mbswcs4,
tst-mbswcs5 and tst-wctype.
(tests-special): Remove $(objpfx)tst-mbswcs.out and
$(objpfx)tst-wctype.out.
($(objpfx)tst-mbswcs.out): Remove rule.
($(objpfx)tst-wctype.out): Likewise.
(tst-wctype-ENV): New variable.
* tst-mbswcs.sh: Remove file.
* tst-wctype.sh: Likewise.
Various glibc build / install / test code has C locale settings that
are redundant with LC_ALL=C.
LC_ALL takes precedence over LANG, so anywhere that sets LC_ALL=C
(explicitly, or through it being in the default environment for
running tests) does not need to set LANG=C. LC_ALL=C also takes
precedence over LANGUAGE, since
2001-01-02 Ulrich Drepper <drepper@redhat.com>
* intl/dcigettext.c (guess_category_value): Rewrite so that LANGUAGE
value is ignored if the selected locale is the C locale.
* intl/tst-gettext.c: Set locale for above change.
* intl/tst-translit.c: Likewise.
and so settings of LANGUAGE=C are also redundant when LC_ALL=C is
set. One test also had LC_ALL=C in its -ENV setting, although it's
part of the default environment used for tests.
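The precedence rules described above can be modelled in a few lines of Python. This is a simplified sketch of the behaviour as stated in this message, not the code from intl/dcigettext.c; the function names are hypothetical.

```python
def effective_locale(env, category="LC_MESSAGES"):
    """Locale name chosen by the LC_ALL > LC_xxx > LANG order."""
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name)
        if value:
            return value
    return "C"


def message_languages(env):
    """Languages consulted for message lookup in this simplified model."""
    locale_name = effective_locale(env)
    language = env.get("LANGUAGE")
    # LANGUAGE is ignored when the selected locale is the C locale.
    if locale_name != "C" and language:
        return language.split(":")
    return [locale_name]


if __name__ == "__main__":
    print(message_languages({"LC_ALL": "C", "LANG": "de_DE.UTF-8", "LANGUAGE": "fr"}))
    print(message_languages({"LANG": "de_DE.UTF-8", "LANGUAGE": "fr:es"}))
```

In this model, once LC_ALL=C is present neither LANG nor LANGUAGE can change the result, which is why the extra settings are redundant.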
This patch removes the redundant settings. It removes a suggestion in
install.texi of setting LANGUAGE=C LC_ALL=C for "make install"; the
Makefile.in target "install" already sets LC_ALL=C so there's no need
for the user to set it (nor should there be any such need).
If some build machine tool used by "make install" uses a version of
libintl predating that 2001 change, and the user has LANGUAGE set, the
removal of LANGUAGE=C from the Makefile.in "install" rule could in
principle affect the user's installation. However, I don't think we
need to be concerned about pre-2001 build tools.
Tested x86_64.
* Makefile (install): Don't set LANGUAGE.
* Makefile.in (install): Likewise.
* assert/Makefile (test-assert-ENV): Remove variable.
(test-assert-perr-ENV): Likewise.
* elf/Makefile (neededtest4-ENV): Likewise.
* iconvdata/Makefile ($(inst_gconvdir)/gconv-modules)
[$(cross-compiling) = no]: Don't set LANGUAGE.
* io/ftwtest-sh (LANG): Remove variable.
* libio/Makefile (tst-widetext-ENV): Likewise.
* manual/install.texi (Running make install): Don't refer to
environment settings for make install.
* INSTALL: Regenerated.
* nptl/tst-tls6.sh: Don't set LANG.
* posix/globtest.sh (LANG): Remove variable.
* string/Makefile (tester-ENV): Likewise.
(inl-tester-ENV): Likewise.
(noinl-tester-ENV): Likewise.
* sysdeps/s390/s390-64/Makefile ($(inst_gconvdir)/gconv-modules)
[$(cross-compiling) = no]: Don't set LANGUAGE.
* timezone/Makefile (build-testdata): Use $(built-program-cmd)
without explicit environment settings.
localedata/ChangeLog:
* tst-fmon.sh: Don't set LANGUAGE.
* tst-locale.sh: Likewise.
One wart in the original support for test wrappers for cross testing,
as noted in
<https://sourceware.org/ml/libc-alpha/2012-10/msg00722.html>, is the
requirement for test wrappers to pass a poorly-defined set of
environment variables from the build system to the system running the
glibc under test. Although some variables are passed explicitly via
$(test-wrapper-env), including LD_* variables that simply can't be
passed implicitly because of the side effects they'd have on the build
system's dynamic linker, others are passed implicitly, including
variables such as GCONV_PATH and LOCPATH that could potentially affect
the build system's libc (so effectively relying on any such effects
not breaking the wrappers). In addition, the code in
cross-test-ssh.sh for preserving environment variables is fragile (it
depends on how bash formats a list of exported variables, and could
well break for multi-line variable definitions where the contents
contain things looking like other variable definitions).
This patch moves to explicitly passing environment variables via
$(test-wrapper-env). Makefile variables that previously used
$(test-wrapper) are split up into -before-env and -after-env parts
that can be passed separately to the various .sh files used in
testing, so those files can then insert environment settings between
the two parts.
The common default environment settings in make-test-out are made into
a separate makefile variable that can also be passed to scripts,
rather than many scripts duplicating those settings (for testing an
installed glibc, it is desirable to have the GCONV_PATH setting in
just one place, so just that one place needs to support it pointing to
an installed sysroot instead of the build tree). The default settings
are included in the variables such as $(test-program-prefix), so that
if tests do not need any non-default settings they can continue to use
single variables rather than the split-up variables.
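How a script might combine the split-up parts can be sketched in Python. The function names, the wrapper command and the paths below are hypothetical; the sketch only shows the ordering idea (wrapper part, default settings, test-specific settings, loader part, program), with later assignments taking precedence.

```python
def compose_command(before_env, default_env, extra_env, after_env, program):
    """Build an argv list with environment assignments between the two parts."""
    assignments = [f"{name}={value}" for name, value in default_env]
    assignments += [f"{name}={value}" for name, value in extra_env]
    return before_env + assignments + after_env + [program]


def resolve(assignments):
    """Environment that results when the last assignment to a variable wins."""
    env = {}
    for item in assignments:
        name, _, value = item.partition("=")
        env[name] = value
    return env


if __name__ == "__main__":
    argv = compose_command(
        ["test-wrapper", "env"],                       # hypothetical wrapper
        [("GCONV_PATH", "/build/iconvdata"), ("LC_ALL", "C")],
        [("LC_ALL", "de_DE.UTF-8")],                   # test-specific override
        ["/build/elf/ld.so"],                          # hypothetical loader path
        "./tst-example",
    )
    print(" ".join(argv))
```

Keeping the defaults in one list means a test that needs nothing special can pass an empty extra list and still get the common environment.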
Although this patch cleans up LC_ALL=C settings (that being part of
the common defaults), various LANG=C and LANGUAGE=C settings remain.
Those are generally unnecessary and I propose a subsequent cleanup to
remove them. LC_ALL takes precedence over LANG, and while LANGUAGE
takes precedence over LC_ALL, it only does so for settings other than
LC_ALL=C. So LC_ALL=C on its own is sufficient to ensure the C
locale, and anything that gets LC_ALL=C does not need the other
settings.
While preparing this patch I noticed some tests with .sh files that
appeared to do nothing beyond what the generic makefile support for
tests can do (localedata/tst-wctype.sh - the makefiles support -ENV
variables and .input files - and localedata/tst-mbswcs.sh - just runs
five tests that could be run individually from the makefile). So I
propose another subsequent cleanup to move those to using the generic
support instead of special .sh files.
Tested x86_64 (native) and powerpc32 (cross).
* Makeconfig (run-program-env): New variable.
(run-program-prefix-before-env): Likewise.
(run-program-prefix-after-env): Likewise.
(run-program-prefix): Define in terms of new variables.
(built-program-cmd-before-env): New variable.
(built-program-cmd-after-env): Likewise.
(built-program-cmd): Define in terms of new variables.
(test-program-prefix-before-env): New variable.
(test-program-prefix-after-env): Likewise.
(test-program-prefix): Define in terms of new variables.
(test-program-cmd-before-env): New variable.
(test-program-cmd-after-env): Likewise.
(test-program-cmd): Define in terms of new variables.
* Rules (make-test-out): Use $(run-program-env).
* scripts/cross-test-ssh.sh (env_blacklist): Remove variable.
(help): Do not mention environment variables. Mention
--timeoutfactor option.
(timeoutfactor): New variable.
(blacklist_exports): Remove function.
(exports): Remove variable.
(command): Do not include ${exports}.
* manual/install.texi (Configuring and compiling): Do not mention
test wrappers preserving environment variables. Mention that last
assignment to a variable must take precedence.
* INSTALL: Regenerated.
* benchtests/Makefile (run-bench): Use $(run-program-env).
* catgets/Makefile ($(objpfx)test1.cat): Use
$(built-program-cmd-before-env), $(run-program-env) and
$(built-program-cmd-after-env).
($(objpfx)test2.cat): Do not specify environment variables
explicitly.
($(objpfx)de/libc.cat): Use $(built-program-cmd-before-env),
$(run-program-env) and $(built-program-cmd-after-env).
($(objpfx)test-gencat.out): Use $(test-program-cmd-before-env),
$(run-program-env) and $(test-program-cmd-after-env).
($(objpfx)sample.SJIS.cat): Do not specify environment variables
explicitly.
* catgets/test-gencat.sh: Use test_program_cmd_before_env,
run_program_env and test_program_cmd_after_env arguments.
* elf/Makefile ($(objpfx)tst-pathopt.out): Use $(run-program-env).
* elf/tst-pathopt.sh: Use run_program_env argument.
* iconvdata/Makefile ($(objpfx)iconv-test.out): Use
$(test-wrapper-env) and $(run-program-env).
* iconvdata/run-iconv-test.sh: Use test_wrapper_env and
run_program_env arguments.
* iconvdata/tst-table.sh: Do not set GCONV_PATH explicitly.
* intl/Makefile ($(objpfx)tst-gettext.out): Use
$(test-program-prefix-before-env), $(run-program-env) and
$(test-program-prefix-after-env).
($(objpfx)tst-gettext2.out): Likewise.
* intl/tst-gettext.sh: Use test_program_prefix_before_env,
run_program_env and test_program_prefix_after_env arguments.
* intl/tst-gettext2.sh: Likewise.
* intl/tst-gettext4.sh: Do not set environment variables
explicitly.
* intl/tst-gettext6.sh: Likewise.
* intl/tst-translit.sh: Likewise.
* malloc/Makefile ($(objpfx)tst-mtrace.out): Use
$(test-program-prefix-before-env), $(run-program-env) and
$(test-program-prefix-after-env).
* malloc/tst-mtrace.sh: Use test_program_prefix_before_env,
run_program_env and test_program_prefix_after_env arguments.
* math/Makefile (run-regen-ulps): Use $(run-program-env).
* nptl/Makefile ($(objpfx)tst-tls6.out): Use $(run-program-env).
* nptl/tst-tls6.sh: Use run_program_env argument. Set LANG=C
explicitly with each use of ${test_wrapper_env}.
* posix/Makefile ($(objpfx)wordexp-tst.out): Use
$(test-program-prefix-before-env), $(run-program-env) and
$(test-program-prefix-after-env).
* posix/tst-getconf.sh: Do not set environment variables
explicitly.
* posix/wordexp-tst.sh: Use test_program_prefix_before_env,
run_program_env and test_program_prefix_after_env arguments.
* stdio-common/tst-printf.sh: Do not set environment variables
explicitly.
* stdlib/Makefile ($(objpfx)tst-fmtmsg.out): Use
$(test-program-prefix-before-env), $(run-program-env) and
$(test-program-prefix-after-env).
* stdlib/tst-fmtmsg.sh: Use test_program_prefix_before_env,
run_program_env and test_program_prefix_after_env arguments.
Split $test calls into $test_pre and $test.
* timezone/Makefile (build-testdata): Use
$(built-program-cmd-before-env), $(run-program-env) and
$(built-program-cmd-after-env).
localedata/ChangeLog:
* Makefile ($(addprefix $(objpfx),$(CTYPE_FILES))): Use
$(built-program-cmd-before-env), $(run-program-env) and
$(built-program-cmd-after-env).
($(objpfx)sort-test.out): Use $(test-program-prefix-before-env),
$(run-program-env) and $(test-program-prefix-after-env).
($(objpfx)tst-fmon.out): Use $(run-program-prefix-before-env),
$(run-program-env) and $(run-program-prefix-after-env).
($(objpfx)tst-locale.out): Use $(built-program-cmd-before-env),
$(run-program-env) and $(built-program-cmd-after-env).
($(objpfx)tst-trans.out): Use $(run-program-prefix-before-env),
$(run-program-env), $(run-program-prefix-after-env),
$(test-program-prefix-before-env) and
$(test-program-prefix-after-env).
($(objpfx)tst-ctype.out): Use $(test-program-cmd-before-env),
$(run-program-env) and $(test-program-cmd-after-env).
($(objpfx)tst-wctype.out): Likewise.
($(objpfx)tst-langinfo.out): Likewise.
($(objpfx)tst-langinfo-static.out): Likewise.
* gen-locale.sh: Use localedef_before_env, run_program_env and
localedef_after_env arguments.
* sort-test.sh: Use test_program_prefix_before_env,
run_program_env and test_program_prefix_after_env arguments.
* tst-ctype.sh: Use tst_ctype_before_env, run_program_env and
tst_ctype_after_env arguments.
* tst-fmon.sh: Use run_program_prefix_before_env, run_program_env
and run_program_prefix_after_env arguments.
* tst-langinfo.sh: Use tst_langinfo_before_env, run_program_env
and tst_langinfo_after_env arguments.
* tst-locale.sh: Use localedef_before_env, run_program_env and
localedef_after_env arguments.
* tst-mbswcs.sh: Do not set environment variables explicitly.
* tst-numeric.sh: Likewise.
* tst-rpmatch.sh: Likewise.
* tst-trans.sh: Use run_program_prefix_before_env,
run_program_env, run_program_prefix_after_env,
test_program_prefix_before_env and test_program_prefix_after_env
arguments.
* tst-wctype.sh: Use tst_wctype_before_env, run_program_env and
tst_wctype_after_env arguments.
Tests run using the default $(make-test-out) automatically get
GCONV_PATH and LC_ALL set, whether or not those environment variables
are actually needed for the individual test. However, they do not get
LOCPATH set, meaning that a large number of tests have -ENV settings
just to set LOCPATH.
This patch moves LOCPATH into the default environment used for all
tests, on the principle that like GCONV_PATH any settings needed to
use files associated with the newly built library, rather than any old
installed files, are appropriate to use by default.
A further motivation is that various tests using .sh files also set
some combination of LC_ALL, GCONV_PATH and LOCPATH. Preferably .sh
files should also use the default environment with any additions
required for the individual test. Now, it was suggested in
<https://sourceware.org/ml/libc-alpha/2014-05/msg00715.html> that
various Makefile variables used in testing should be derived by
composing the -before-env and -after-env variables used when explicit
environment settings are required. With such a change, it's also
natural for those variables to include the default settings (via some
intermediate makefile variable also used in make-test-out).
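As an illustration only, the composition of a -before-env variable, the
default settings and an -after-env variable can be sketched in shell as
below. The /build paths are hypothetical, and printenv merely stands in
for the loader prefix that a real -after-env variable would carry. The
sketch relies on env applying assignments from left to right, so that a
test-specific setting placed after the defaults takes effect.

```shell
# Sketch only: these values are hypothetical placeholders, not the
# values glibc's makefiles actually compute.
before_env="env"
run_program_env="GCONV_PATH=/build/iconvdata LOCPATH=/build/localedata LC_ALL=C"
after_env="printenv"   # stand-in for the real program prefix

# Print the LC_ALL value seen by a command that is run with the default
# environment plus one test-specific override placed after it.
show_lc_all () {
  # Unquoted expansions are intentional: the variables hold several words.
  ${before_env} ${run_program_env} LC_ALL="$1" ${after_env} LC_ALL
}

show_lc_all de_DE.UTF-8
```

A script that needs nothing beyond the defaults would not need the
separate pieces at all, which is the reduction in churn argued for
below.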
Because some .sh files only set variables that correspond to the
default settings, or a subset thereof, and this applies to more of the
.sh files once LOCPATH is in the default settings, doing so reduces
the size of a revised version of
<https://sourceware.org/ml/libc-alpha/2014-05/msg00596.html>: scripts
only needing the (expanded) default settings will not need to receive
the separate -before-env and -after-env variables, only the single
variable they do at present. So moving LOCPATH into the default
settings can reduce churn caused by subsequent patches.
Tested x86_64 and x86.
* Rules (make-test-out): Include
LOCPATH=$(common-objpfx)localedata in default environment.
* debug/Makefile (tst-chk1-ENV): Remove variable.
(tst-chk2-ENV): Likewise.
(tst-chk3-ENV): Likewise.
(tst-chk4-ENV): Likewise.
(tst-chk5-ENV): Likewise.
(tst-chk6-ENV): Likewise.
(tst-lfschk1-ENV): Likewise.
(tst-lfschk2-ENV): Likewise.
(tst-lfschk3-ENV): Likewise.
(tst-lfschk4-ENV): Likewise.
(tst-lfschk5-ENV): Likewise.
(tst-lfschk6-ENV): Likewise.
* iconvdata/Makefile (bug-iconv6-ENV): Likewise.
(tst-iconv7-ENV): Likewise.
* intl/Makefile (LOCPATH-ENV): Likewise.
(tst-codeset-ENV): Likewise.
(tst-gettext3-ENV): Likewise.
(tst-gettext5-ENV): Likewise.
* libio/Makefile (tst-widetext-ENV): Don't set LOCPATH.
(tst-fopenloc-ENV): Likewise.
(tst-fgetws-ENV): Remove variable.
(tst-ungetwc1-ENV): Likewise.
(tst-ungetwc2-ENV): Likewise.
(bug-ungetwc2-ENV): Likewise.
(tst-swscanf-ENV): Likewise.
(bug-ftell-ENV): Likewise.
(tst-fgetwc-ENV): Likewise.
(tst-fseek-ENV): Likewise.
(tst-ftell-partial-wide-ENV): Likewise.
(tst-ftell-active-handler-ENV): Likewise.
(tst-ftell-append-ENV): Likewise.
* posix/Makefile (tst-fnmatch-ENV): Likewise.
(tst-regexloc-ENV): Likewise.
(bug-regex1-ENV): Likewise.
(tst-regex-ENV): Likewise.
(tst-regex2-ENV): Likewise.
(bug-regex5-ENV): Likewise.
(bug-regex6-ENV): Likewise.
(bug-regex17-ENV): Likewise.
(bug-regex18-ENV): Likewise.
(bug-regex19-ENV): Likewise.
(bug-regex20-ENV): Likewise.
(bug-regex22-ENV): Likewise.
(bug-regex23-ENV): Likewise.
(bug-regex25-ENV): Likewise.
(bug-regex26-ENV): Likewise.
(bug-regex30-ENV): Likewise.
(bug-regex32-ENV): Likewise.
(bug-regex33-ENV): Likewise.
(bug-regex34-ENV): Likewise.
(bug-regex35-ENV): Likewise.
(tst-rxspencer-ENV): Likewise.
(tst-rxspencer-no-utf8-ENV): Likewise.
* stdio-common/Makefile (tst-sprintf-ENV): Likewise.
(tst-sscanf-ENV): Likewise.
(tst-swprintf-ENV): Likewise.
(tst-swscanf-ENV): Likewise.
(test-vfprintf-ENV): Likewise.
(scanf13-ENV): Likewise.
(bug14-ENV): Likewise.
(tst-grouping-ENV): Likewise.
* stdlib/Makefile (tst-strtod-ENV): Likewise.
(tst-strtod3-ENV): Likewise.
(tst-strtod4-ENV): Likewise.
(tst-strtod5-ENV): Likewise.
(testmb2-ENV): Likewise.
* string/Makefile (tst-strxfrm-ENV): Likewise.
(tst-strxfrm2-ENV): Likewise.
(bug-strcoll1-ENV): Likewise.
(test-strcasecmp-ENV): Likewise.
(test-strncasecmp-ENV): Likewise.
* time/Makefile (tst-strptime-ENV): Likewise.
(tst-ftime_l-ENV): Likewise.
* wcsmbs/Makefile (tst-btowc-ENV): Likewise.
(tst-mbrtowc-ENV): Likewise.
(tst-wcrtomb-ENV): Likewise.
(tst-mbrtowc2-ENV): Likewise.
(tst-c16c32-1-ENV): Likewise.
(tst-mbsnrtowcs-ENV): Likewise.
localedata/ChangeLog:
* Makefile (TEST_MBWC_ENV): Remove variable.
(tst_iswalnum-ENV): Likewise.
(tst_iswalpha-ENV): Likewise.
(tst_iswcntrl-ENV): Likewise.
(tst_iswctype-ENV): Likewise.
(tst_iswdigit-ENV): Likewise.
(tst_iswgraph-ENV): Likewise.
(tst_iswlower-ENV): Likewise.
(tst_iswprint-ENV): Likewise.
(tst_iswpunct-ENV): Likewise.
(tst_iswspace-ENV): Likewise.
(tst_iswupper-ENV): Likewise.
(tst_iswxdigit-ENV): Likewise.
(tst_mblen-ENV): Likewise.
(tst_mbrlen-ENV): Likewise.
(tst_mbrtowc-ENV): Likewise.
(tst_mbsrtowcs-ENV): Likewise.
(tst_mbstowcs-ENV): Likewise.
(tst_mbtowc-ENV): Likewise.
(tst_strcoll-ENV): Likewise.
(tst_strfmon-ENV): Likewise.
(tst_strxfrm-ENV): Likewise.
(tst_swscanf-ENV): Likewise.
(tst_towctrans-ENV): Likewise.
(tst_towlower-ENV): Likewise.
(tst_towupper-ENV): Likewise.
(tst_wcrtomb-ENV): Likewise.
(tst_wcscat-ENV): Likewise.
(tst_wcschr-ENV): Likewise.
(tst_wcscmp-ENV): Likewise.
(tst_wcscoll-ENV): Likewise.
(tst_wcscpy-ENV): Likewise.
(tst_wcscspn-ENV): Likewise.
(tst_wcslen-ENV): Likewise.
(tst_wcsncat-ENV): Likewise.
(tst_wcsncmp-ENV): Likewise.
(tst_wcsncpy-ENV): Likewise.
(tst_wcspbrk-ENV): Likewise.
(tst_wcsrtombs-ENV): Likewise.
(tst_wcsspn-ENV): Likewise.
(tst_wcsstr-ENV): Likewise.
(tst_wcstod-ENV): Likewise.
(tst_wcstok-ENV): Likewise.
(tst_wcstombs-ENV): Likewise.
(tst_wcswidth-ENV): Likewise.
(tst_wcsxfrm-ENV): Likewise.
(tst_wctob-ENV): Likewise.
(tst_wctomb-ENV): Likewise.
(tst_wctrans-ENV): Likewise.
(tst_wctype-ENV): Likewise.
(tst_wcwidth-ENV): Likewise.
(tst-digits-ENV): Likewise.
(tst-mbswcs6-ENV): Likewise.
(tst-xlocale1-ENV): Likewise.
(tst-xlocale2-ENV): Likewise.
(tst-strfmon1-ENV): Likewise.
(tst-strptime-ENV): Likewise.
(tst-setlocale-ENV): Don't set LOCPATH.
(bug-iconv-trans-ENV): Remove variable.
(tst-sscanf-ENV): Likewise.
(tst-leaks-ENV): Don't set LOCPATH.
(bug-setlocale1-ENV): Remove variable.
(bug-setlocale1-static-ENV): Likewise.
(tst-setlocale2-ENV): Likewise.
As previously noted
<https://sourceware.org/ml/libc-alpha/2013-05/msg00696.html>,
$(elf-objpfx) and $(elfobjdir) are redundant and should be
consolidated. This patch consolidates on $(elf-objpfx) (for
consistency with $(csu-objpfx)), also changing direct uses of
$(common-objpfx)elf/ to use $(elf-objpfx).
Tested x86_64, including that installed shared libraries are unchanged
by the patch.
* Makeconfig [$(build-hardcoded-path-in-tests) = yes]
(rtld-tests-LDFLAGS): Use $(elf-objpfx) instead of
$(common-objpfx)elf/.
(link-libc-before-gnulib): Likewise.
(elfobjdir): Remove variable.
* Makefile (install): Use $(elf-objpfx) instead of
$(common-objpfx)elf/.
* Makerules (link-libc-args): Use $(elf-objpfx) instead of
$(elfobjdir)/.
(link-libc-deps): Likewise.
($(common-objpfx)libc.so): Likewise.
($(common-objpfx)linkobj/libc.so): Likewise.
[$(cross-compiling) = no] (symbolic-link-prog): Use $(elf-objpfx)
instead of $(common-objpfx)elf/.
(symbolic-link-list): Likewise.
* iconvdata/Makefile ($(inst_gconvdir)/gconv-modules)
[$(cross-compiling) = no]: Likewise.
* sysdeps/arm/Makefile (gnulib-arch): Use $(elf-objpfx) instead of
$(elfobjdir)/.
(static-gnulib-arch): Likewise.
* sysdeps/s390/s390-64/Makefile ($(inst_gconvdir)/gconv-modules)
[$(cross-compiling) = no]: Use $(elf-objpfx) instead of
$(common-objpfx)elf/.
localedata/ChangeLog:
* Makefile (LOCALEDEF): Use $(elf-objpfx) instead of
$(common-objpfx)elf/.
For static linking, the locale code avoids linking code and data for
unused categories. However, for nl_langinfo we know only at runtime
which categories are used, so it must reference every
nl_current_CATEGORY symbol directly.
This was broken by commit bc3e1c1273, which merged nl_langinfo_l and
nl_langinfo and lost some code in the process.
In order to detect locale issues with static linking, compile a version
of tst-langinfo with static linking.
Note: this is Debian bug#747103 reported by Raphael <raphael.astier@eliot-sa.com>
In <https://sourceware.org/ml/libc-alpha/2014-01/msg00198.html> I
raised the question of counting miscellaneous dependencies of tests,
built on the host rather than the build system, as tests. That way,
when test failures don't stop "make check", neither do failures of
those other builds on the host, and a flaky host doesn't stop "make
check" from producing a complete summary of test results. Brooks
supported that idea in
<https://sourceware.org/ml/libc-alpha/2014-02/msg00301.html>.
This patch implements that change for all the examples I could find:
one message catalog in catgets/, locales in localedata/ and timezone
files in timezone/.
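The idea of treating such a build step as a test can be sketched as
follows. This is a hypothetical helper, not the definition of
$(evaluate-test); the function name and the result file format are
assumptions made for the example. It runs a command, records whether it
succeeded, and always returns success so that the surrounding build
carries on.

```shell
# Hypothetical helper, not glibc's $(evaluate-test): run a command,
# record PASS or FAIL in a result file next to its output file, and
# return 0 either way so that a failure does not stop the build.
run_and_record () {
  name=$1
  out=$2
  shift 2
  if "$@" > "$out" 2>&1; then
    result=PASS
  else
    result=FAIL
  fi
  # Strip a trailing .out, if any, before adding the result suffix.
  printf '%s: %s\n' "$result" "$name" > "${out%.out}.test-result"
  return 0
}
```

For example, "run_and_record demo/step /tmp/step.out false" leaves
"FAIL: demo/step" in /tmp/step.test-result and still returns 0.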
Tested x86_64.
* catgets/Makefile (tests-special): Add $(objpfx)sample.SJIS.cat.
($(objpfx)sample.SJIS.cat): Use $(evaluate-test).
* timezone/Makefile (testdata): Move definition above include of
Rules.
(test-zones): New variable.
(tests-special): Add zone files.
(build-testdata): Use $(evaluate-test).
localedata/ChangeLog:
* Makefile (LOCALES): Move definition above include of Rules.
(LOCALE_SRCS): Likewise.
(CHARMAPS): Likewise.
(CTYPE_FILES): Likewise.
(tests-special): Add locale files.
($(addprefix $(objpfx),$(CTYPE_FILES))): Use $(evaluate-test).
This patch systematically renames miscellaneous tests so their outputs
use a *.out name (unless the test is just running some glibc program
with its conventional output file name, rather than a special program
at all, as in catgets tests generating *.cat). In the case of the
iconv test test-iconvconfig, output is redirected where it wasn't
before.
In various places the "generated" variable is updated to reflect the
revised test names; in iconvdata/Makefile a typo (mmtrace-tst-loading)
is also fixed. resolv/Makefile sets both "generate" (which appears
unused) and "generated". Bitrot in the settings of these variables
could no doubt be fixed so that "make clean" after build and testing
leaves results the same as after configure (and indeed the
tests-special / xtests-special variables could be used to simplify
things, by removing those files automatically rather than listing them
manually in these variables), and "make distclean" leaves an empty
build directory, but right now it appears various files don't get
deleted. I think they are liable to continue to bitrot in the absence
of routine testing that these targets actually work, given that
building in the source directory isn't supported and that was the main
use of such makefile targets.
Tested x86_64.
* elf/Makefile (tests-special): Rename tests to end with .out.
($(objpfx)noload-mem): Likewise.
($(objpfx)tst-leaks1-mem): Likewise.
($(objpfx)tst-leaks1-static-mem.out): Likewise.
* iconv/Makefile (xtests-special): Change test-iconvconfig to
$(objpfx)test-iconvconfig.out.
(test-iconvconfig): Change to $(objpfx)test-iconvconfig.out. Use
set -e inside subshell and redirect output to file.
* iconvdata/Makefile (generated): Rename tests to end with .out.
Correct typo.
(tests-special): Rename tests to end with .out.
($(objpfx)mtrace-tst-loading): Likewise.
* intl/Makefile (generated): Likewise.
(tests-special): Likewise.
($(objpfx)mtrace-tst-gettext): Likewise.
* misc/Makefile (generated): Likewise.
(tests-special): Likewise.
($(objpfx)tst-error1-mem): Likewise.
* nptl/Makefile (tests-special): Likewise.
($(objpfx)tst-stack3-mem): Likewise.
(generated): Likewise.
* posix/Makefile (generated): Likewise.
(tests-special): Likewise.
(xtests-special): Likewise.
($(objpfx)tst-fnmatch-mem): Likewise.
($(objpfx)bug-regex2-mem): Likewise.
($(objpfx)bug-regex14-mem): Likewise.
($(objpfx)bug-regex21-mem): Likewise.
($(objpfx)bug-regex31-mem): Likewise.
($(objpfx)tst-vfork3-mem): Likewise.
($(objpfx)tst-rxspencer-no-utf8-mem): Likewise.
($(objpfx)tst-pcre-mem): Likewise.
($(objpfx)tst-boost-mem): Likewise.
($(objpfx)bug-ga2-mem): Likewise.
($(objpfx)bug-glob2-mem): Likewise.
* resolv/Makefile (generate): Likewise.
(tests-special): Likewise.
(xtests-special): Likewise.
(generated): Likewise.
($(objpfx)mtrace-tst-leaks): Likewise.
($(objpfx)mtrace-tst-leaks2): Likewise.
localedata/ChangeLog:
* Makefile (generated): Rename tests to end with .out.
(tests-special): Likewise.
($(objpfx)mtrace-tst-leaks): Likewise.
This patch is a revised and updated version of
<https://sourceware.org/ml/libc-alpha/2014-01/msg00196.html>.
In order to generate overall summaries of the results of all tests in
the glibc testsuite, we need to identify and concatenate the files
with the results of individual tests.
Tomas Dohnalek's patch used $(common-objpfx)*/*.test-result for this.
However, the normal glibc approach is explicit enumeration of the
expected set of files with a given property, rather than all files
matching some pattern like that. Furthermore, we would like to be
able to mark tests as UNRESOLVED if the file with their results is for
some reason missing, and in future we would like to be able to mark
tests as UNSUPPORTED if they are disabled for a particular
configuration (rather than simply having them missing from the list of
tests as at present). Such handling of tests that were not run or did
not record results requires an explicit enumeration of tests.
For the tests following the default makefile rules, $(tests) (and
$(xtests)) provides such an enumeration. Others, however, are added
directly as dependencies of the "tests" and "xtests" makefile
targets. This patch changes the makefiles to put them in variables
tests-special and xtests-special, with appropriate dependencies on the
tests listed there then being added centrally.
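The difference between a glob and an explicit enumeration can be
sketched in shell as below. The function name and the one-line
"STATUS: name" result format are assumptions made for the example, not
the summary code itself. A glob over *.test-result can only report
files that exist, whereas a list of expected tests also lets a missing
result be reported.

```shell
# Hypothetical sketch: print one summary line per expected test.
# $1 is the build directory; the remaining arguments are test names
# relative to it, given as an explicit list rather than found by a glob.
summarize_tests () {
  objdir=$1
  shift
  for t in "$@"; do
    f="$objdir/$t.test-result"
    if [ -s "$f" ]; then
      cat "$f"
    else
      # No (or empty) result file: the test did not record a result.
      printf 'UNRESOLVED: %s\n' "$t"
    fi
  done
}
```

Called with a test whose result file was never written, the function
prints an UNRESOLVED line for it instead of silently omitting it.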
Those variables are used in Rules and so need to be set before Rules
is included in a subdirectory makefile, which is often earlier in the
makefile than the dependencies were present before. We previously
discussed the question of where to include Rules; see the question at
<https://sourceware.org/ml/libc-alpha/2012-11/msg00798.html>, and a
discussion in
<https://sourceware.org/ml/libc-alpha/2013-01/msg00337.html> of why
Rules is included early rather than late in subdirectory makefiles.
It was necessary to avoid an indirection through the check-abi target
and get the check-abi-* targets for individual libraries into the
tests-special variable. The intl/ test $(objpfx)tst-gettext.out,
previously built only because of dependencies from other tests, was
also added to tests-special for the same reason.
The entries in tests-special are the full makefile targets, complete
with $(objpfx) and .out. If a future change causes tests to be named
consistently with a .out suffix, this can be changed to include just
the path relative to $(objpfx), without .out.
Tested x86_64, including that the same set of files is generated in
the build directory by a build and testsuite run both before and after
the patch (except for changes to the
elf/tst-null-argv.debug.out.<number> file name), and a build with
run-built-tests=no to verify there aren't any more obvious instances
of the issue Marcus Shawcroft reported with a previous version in
<https://sourceware.org/ml/libc-alpha/2014-01/msg00462.html>.
* Makefile (tests): Change dependencies to ....
(tests-special): ... additions to this variable.
(tests): Depend on $(tests-special).
* Makerules (check-abi-list): New variable.
(check-abi): Depend on $(check-abi-list).
[$(subdir) = elf] (tests-special): Add
$(objpfx)check-abi-libc.out.
[$(build-shared) = yes && subdir] (tests-special): Add
$(check-abi-list).
[$(build-shared) = yes && subdir] (tests): Do not depend on
check-abi.
* Rules (tests): Depend on $(tests-special).
(xtests): Depend on $(xtests-special).
* catgets/Makefile (tests): Change dependencies to ....
(tests-special): ... additions to this variable.
* conform/Makefile (tests): Change dependencies to ....
(tests-special): ... additions to this variable.
* elf/Makefile (tests): Change dependencies to ....
(tests-special): ... additions to this variable.
* grp/Makefile (tests): Change dependencies to ....
(tests-special): ... additions to this variable.
* iconv/Makefile (xtests): Change dependencies to ....
(xtests-special): ... additions to this variable.
* iconvdata/Makefile (tests): Change dependencies to ....
(tests-special): ... additions to this variable.
* intl/Makefile (tests): Change dependencies to ....
(tests-special): ... additions to this variable. Also add
$(objpfx)tst-gettext.out.
* io/Makefile (tests): Change dependencies to ....
(tests-special): ... additions to this variable.
* libio/Makefile (tests): Change dependencies to ....
(tests-special): ... additions to this variable.
* malloc/Makefile (tests): Change dependencies to ....
(tests-special): ... additions to this variable.
* misc/Makefile (tests): Change dependencies to ....
(tests-special): ... additions to this variable.
* nptl/Makefile (tests): Change dependencies to ....
(tests-special): ... additions to this variable.
* nptl_db/Makefile (tests): Change dependencies to ....
(tests-special): ... additions to this variable.
* posix/Makefile (tests): Change dependencies to ....
(tests-special): ... additions to this variable.
(xtests): Change dependencies to ....
(xtests-special): ... additions to this variable.
* resolv/Makefile (tests): Change dependencies to ....
(tests-special): ... additions to this variable.
(xtests): Change dependencies to ....
(xtests-special): ... additions to this variable.
* stdio-common/Makefile (tests): Change dependencies to ....
(tests-special): ... additions to this variable.
(do-tst-unbputc): Remove target.
(do-tst-printf): Likewise.
* stdlib/Makefile (tests): Change dependencies to ....
(tests-special): ... additions to this variable.
* string/Makefile (tests): Change dependencies to ....
(tests-special): ... additions to this variable.
* sysdeps/x86/Makefile (tests): Change dependencies to ....
(tests-special): ... additions to this variable.
localedata/ChangeLog:
* Makefile (tests): Change dependencies to ....
(tests-special): ... additions to this variable.
In <https://sourceware.org/ml/libc-alpha/2014-01/msg00196.html> I
noted it was necessary to add includes of Makeconfig early in various
subdirectory makefiles for the tests-special variable settings added
by that patch to be conditional on configuration information. No-one
commented on the general question there of whether Makeconfig should
always be included immediately after the definition of subdir.
This patch implements that early inclusion of Makeconfig in each
directory (which is a lot easier than consistent placement of includes
of Rules). Includes are added if needed, or moved up if already
present. Subdirectory "all:" targets are removed, since Makeconfig
provides one.
There is potential for further cleanups I haven't done. Rules and
Makerules have code such as
  ifneq "$(findstring env,$(origin headers))" ""
  headers :=
  endif
to override to empty any value of various variables that came from the
environment. I think there is a case for Makeconfig setting all the
subdirectory variables (other than subdir) to empty to ensure no
outside value is going to take effect if a subdirectory fails to
define a variable. (A list of such variables, possibly out of date
and incomplete, is in manual/maint.texi.) Rules and Makerules would
give errors if Makeconfig hadn't already been included, instead of
including it themselves. The special code to override values coming
from the environment would then be obsolete and could be removed.
Tested x86_64, including that installed binaries are identical before
and after the patch.
* argp/Makefile: Include Makeconfig immediately after defining
subdir.
* assert/Makefile: Likewise.
* benchtests/Makefile: Likewise.
* catgets/Makefile: Likewise.
* conform/Makefile: Likewise.
* crypt/Makefile: Likewise.
* csu/Makefile: Likewise.
(all): Remove target.
* ctype/Makefile: Include Makeconfig immediately after defining
subdir.
* debug/Makefile: Likewise.
* dirent/Makefile: Likewise.
* dlfcn/Makefile: Likewise.
* gmon/Makefile: Likewise.
* gnulib/Makefile: Likewise.
* grp/Makefile: Likewise.
* gshadow/Makefile: Likewise.
* hesiod/Makefile: Likewise.
* hurd/Makefile: Likewise.
(all): Remove target.
* iconvdata/Makefile: Include Makeconfig immediately after
defining subdir.
* inet/Makefile: Likewise.
* intl/Makefile: Likewise.
* io/Makefile: Likewise.
* libio/Makefile: Likewise.
(all): Remove target.
* locale/Makefile: Include Makeconfig immediately after defining
subdir.
* login/Makefile: Likewise.
* mach/Makefile: Likewise.
(all): Remove target.
* malloc/Makefile: Include Makeconfig immediately after defining
subdir.
(all): Remove target.
* manual/Makefile: Include Makeconfig immediately after defining
subdir.
* math/Makefile: Likewise.
* misc/Makefile: Likewise.
* nis/Makefile: Likewise.
* nss/Makefile: Likewise.
* po/Makefile: Likewise.
(all): Remove target.
* posix/Makefile: Include Makeconfig immediately after defining
subdir.
* pwd/Makefile: Likewise.
* resolv/Makefile: Likewise.
* resource/Makefile: Likewise.
* rt/Makefile: Likewise.
* setjmp/Makefile: Likewise.
* shadow/Makefile: Likewise.
* signal/Makefile: Likewise.
* socket/Makefile: Likewise.
* soft-fp/Makefile: Likewise.
* stdio-common/Makefile: Likewise.
* stdlib/Makefile: Likewise.
* streams/Makefile: Likewise.
* string/Makefile: Likewise.
* sunrpc/Makefile: Likewise.
(all): Remove target.
* sysvipc/Makefile: Include Makeconfig immediately after defining
subdir.
* termios/Makefile: Likewise.
* time/Makefile: Likewise.
* timezone/Makefile: Likewise.
(all): Remove target.
* wcsmbs/Makefile: Include Makeconfig immediately after defining
subdir.
* wctype/Makefile: Likewise.
libidn/ChangeLog:
* Makefile: Include Makeconfig immediately after defining subdir.
localedata/ChangeLog:
* Makefile: Include Makeconfig immediately after defining subdir.
(all): Remove target.
nptl/ChangeLog:
* Makefile: Include Makeconfig immediately after defining subdir.
nptl_db/ChangeLog:
* Makefile: Include Makeconfig immediately after defining subdir.