1724 Commits

Author SHA1 Message Date
Mike FABIAN
aceda10bd5 Adapt collation in th_TH locale to use the iso14651_t1_common file and sync the collation with CLDR
I made it to agree as much as possible with the rules from CLDR (see:
https://github.com/unicode-org/cldr/blob/main/common/collation/th.xml).

It seems to be impossible to follow the CLDR rules

  &[before 1]๚<ฯ # should be "variable"

and

  &๛<ๆ # should be "variable"

exactly though. These ask for a primary difference in punctuation
characters whose primary weight should be "IGNORE". But using a
secondary difference instead still sorts the test data correctly and
the previously used collation in th_TH used tertiary differences for
these characters.

There was old localedata/th_TH.in test data in TIS-620 encoding which
was not used (it was not in the localedata/Makefile). I converted this
to UTF-8 and moved it to localedata/th_TH.UTF-8.in and added it to
localedata/Makefile.

Using the existing collation rules in the th_TH locale did not sort that
test file completely correctly; I think my new collation rules based on
iso14651_t1 are better.
2023-09-21 10:34:35 +02:00
Mike FABIAN
bb5bbc2070 Update to Unicode 15.1.0 [BZ #30854]
Unicode 15.1.0 Support: Character encoding, character type info, and
transliteration tables are all updated to Unicode 15.1.0, using
the generator scripts contributed by Mike FABIAN (Red Hat).

    Total removed characters in newly generated CHARMAP: 0
    Total changed characters in newly generated CHARMAP: 0
    Total added characters in newly generated CHARMAP: 627
    Total removed characters in newly generated WIDTH: 0
    Total changed characters in newly generated WIDTH: 0
    Total added characters in newly generated WIDTH: 627

    alpha: Added 622 characters in new ctype which were not in old ctype
    graph: Added 627 characters in new ctype which were not in old ctype
    print: Added 627 characters in new ctype which were not in old ctype
    punct: Added 5 characters in new ctype which were not in old ctype
        The five characters added to punct are:
        2FFC;IDEOGRAPHIC DESCRIPTION CHARACTER SURROUND FROM RIGHT;So;0;ON;;;;;N;;;;;
        2FFD;IDEOGRAPHIC DESCRIPTION CHARACTER SURROUND FROM LOWER RIGHT;So;0;ON;;;;;N;;;;;
        2FFE;IDEOGRAPHIC DESCRIPTION CHARACTER HORIZONTAL REFLECTION;So;0;ON;;;;;N;;;;;
        2FFF;IDEOGRAPHIC DESCRIPTION CHARACTER ROTATION;So;0;ON;;;;;N;;;;;
        31EF;IDEOGRAPHIC DESCRIPTION CHARACTER SUBTRACTION;So;0;ON;;;;;N;;;;;

    The Unicode announcement blog entry says "[...] adds 627
    characters, [...] additions include 622 CJK unified ideographs in
    a new block, [...]", so that looks OK. The Unicode
    blog mentions "six completely new emoji" but they don't appear here as
    they are all sequences and not single code points.

Resolves: BZ #30854

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-09-16 08:37:03 +02:00
Mike FABIAN
71de3aead9 localedata/unicode-gen/utf8_gen.py: adapt regexp to get relevant lines from EastAsianWidth.txt
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-09-16 08:37:02 +02:00
Mike FABIAN
ba017b4f9d Fix regexp syntax warnings in localedata/unicode-gen/ctype_compatibility.py
Fix these:

$ python -m py_compile ./ctype_compatibility.py
./ctype_compatibility.py:146: SyntaxWarning: invalid escape sequence '\)'

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-09-16 08:37:02 +02:00
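Note on ba017b4f9d: a minimal Python sketch of this class of fix, assuming
the warning came from a regexp written as a non-raw string literal. The
pattern and the helper below are hypothetical and are not taken from
ctype_compatibility.py; the actual change in the commit may differ.

```python
import re
import warnings

# Hypothetical pattern, written as a raw string so that "\(" and "\)"
# reach the regexp engine unchanged and trigger no escape warning.
RAW_PATTERN = r'\((?P<name>[a-z]+)\)'


def compiles_without_warning(source):
    """Compile Python source text and report whether it is warning-free."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<sketch>", "exec")
    return not caught


# The same literal with and without the r prefix (illustrative only).
bad_source = "p = '\\)'\n"     # source text:  p = '\)'
good_source = "p = r'\\)'\n"   # source text:  p = r'\)'

if __name__ == "__main__":
    print(compiles_without_warning(bad_source))
    print(compiles_without_warning(good_source))
    print(re.search(RAW_PATTERN, "(alpha)").group("name"))
```

Depending on the Python version the diagnostic is a DeprecationWarning or
a SyntaxWarning; the raw-string form avoids both.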
lijianglin
e1d3312015 add GB18030-2022 charmap and test the entire GB18030 charmap [BZ #30243]
Support GB18030-2022 by adding and changing some transcoding
relationships. Details are as follows:
Add 25 transcoding relationships:
  UE81E 0x82359037
  UE826 0x82359038
  UE82B 0x82359039
  UE82C 0x82359130
  UE832 0x82359131
  UE843 0x82359132
  UE854 0x82359133
  UE864 0x82359134
  UE78D 0x84318236
  UE78F 0x84318237
  UE78E 0x84318238
  UE790 0x84318239
  UE791 0x84318330
  UE792 0x84318331
  UE793 0x84318332
  UE794 0x84318333
  UE795 0x84318334
  UE796 0x84318335
  UE816 0xfe51
  UE817 0xfe52
  UE818 0xfe53
  UE831 0xfe6c
  UE83B 0xfe76
  UE855 0xfe91
Change 6 transcoding relationships:
  U20087 0x95329031
  U20089 0x95329033
  U200CC 0x95329730
  U215D7 0x9536b937
  U2298F 0x9630ba35
  U241FE 0x9635b630
Test the entire GB18030 charmap, not only the Unicode BMP part.

Co-authored-by: yangyanchao <yangyanchao6@huawei.com>
Co-authored-by: liqingqing <liqingqing3@huawei.com>
Co-authored-by: Bruno Haible <bruno@clisp.org>
Reviewed-by: Andreas Schwab <schwab@suse.de>
Reviewed-by: Mike FABIAN <mfabian@redhat.com>
2023-08-29 19:02:30 +02:00
Colin Leroy-Mira
dfe8c44588 localedata: Translit common emojis to smileys [BZ #30649]
Add common emojis to the translit-able characters (mostly
faces and hearts), and translit them to old-fashioned
smileys.

Signed-off-by: Colin Leroy-Mira <colin@colino.net>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
2023-08-29 09:31:23 +02:00
Florian Weimer
4dc6b2dfb0 localedata: de_DE should not use Fräulein
This honorific has fallen out of use quite some time ago.
2023-02-27 16:54:22 +01:00
Joseph Myers
6d7e8eda9b Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
Mike FABIAN
7fe6734d28 Update to Unicode 15.0.0 [BZ #29604]
Unicode 15.0.0 Support: Character encoding, character type info, and
transliteration tables are all updated to Unicode 15.0.0, using
the generator scripts contributed by Mike FABIAN (Red Hat).

    Total added characters in newly generated CHARMAP: 4489
    Total removed characters in newly generated WIDTH: 0
    Total changed characters in newly generated WIDTH: 0
    Total added characters in newly generated WIDTH: 4257

    alpha: Added 4389 characters in new ctype which were not in old ctype
    combining: Added 42 characters in new ctype which were not in old ctype
    combining_level3: Added 34 characters in new ctype which were not in old ctype
    graph: Added 4489 characters in new ctype which were not in old ctype
    lower: Added 73 characters in new ctype which were not in old ctype
    print: Added 4489 characters in new ctype which were not in old ctype
    punct: Missing 5 characters of old ctype in new ctype
        punct: Missing: ఄ 0xc04 TELUGU SIGN COMBINING ANUSVARA ABOVE
        punct: Missing: ྂ 0xf82 TIBETAN SIGN NYI ZLA NAA DA
        punct: Missing: ྃ 0xf83 TIBETAN SIGN SNA LDAN
        punct: Missing: 𑂀 0x11080 KAITHI SIGN CANDRABINDU
        punct: Missing: 𑂁 0x11081 KAITHI SIGN ANUSVARA
            That’s OK, because these are now Alphabetic in DerivedCoreProperties.txt
    punct: Added 105 characters in new ctype which were not in old ctype

Resolves: BZ #29604
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2022-10-06 08:58:33 +02:00
Adhemerval Zanella Netto
de477abcaa Use '%z' instead of '%Z' on printf functions
The Z modifier is a nonstandard synonym for z (that predates z
itself) and compilers might issue a warning for an invalid
conversion specifier.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2022-09-22 08:48:04 -03:00
Florian Weimer
1d78299911 localedata: Convert French language locales (fr_*) to UTF-8 2022-08-17 11:07:00 +02:00
Florian Weimer
01441ae333 de_DE: Convert to UTF-8
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
2022-07-05 09:07:02 +02:00
Emil Soleyman-Zomalan
3e29dc5233 Add locale for syr_SY 2022-04-21 13:05:40 +02:00
Ilyahoo Proshel
189906b687 Add rif_MA locale [BZ #27781]
Resolves: BZ #27781
2022-04-07 14:59:41 +02:00
Adhemerval Zanella
74942fd273 localedata: Fix printf type on tst_mbrtowc
Checked on x86_64-linux-gnu and i686-linux-gnu.
2022-03-31 08:49:55 -03:00
Adhemerval Zanella
d1eefcb2a0 localedata: Remove unused variables in tests
Checked on x86_64-linux-gnu and i686-linux-gnu.
2022-03-31 08:38:35 -03:00
Carlos O'Donell
1c7a34567d localedata: Do not generate output if warnings were present.
With LC_MONETARY parsing fixed we can now generate locales
without forcing output with '-c'.

Removing '-c' from localedef invocation is the equivalent of
using -Werror for localedef.  The glibc locale sources should
always be clean and free from warnings.

We remove '-c' from both test locale generation and the targets
used for installing locales e.g. install-locale-archive, and
install-locale-files.

Tested on x86_64 and i686 without regressions.
Tested with install-locale-archive target.
Tested with install-locale-files target.

Reviewed-by: DJ Delorie <dj@redhat.com>
2022-02-25 07:31:27 -05:00
Carlos O'Donell
7e0ad15c0f localedata: Adjust C.UTF-8 to align with C/POSIX.
We have had one downstream report from Canonical [1] that
an rrdtool test was broken by the differences in LC_TIME
that we had in the non-builtin C locale (C.UTF-8). If one
application has an issue there are going to be others, and
so with this commit we review and fix all the issues that
cause the builtin C locale to be different from C.UTF-8,
which includes:
* mon_decimal_point should be empty e.g. ""
 - Depends on mon_decimal_point_wc fix.
* negative_sign should be empty e.g. ""
* week should be aligned with the builtin C/POSIX locale
* d_fmt corrected with escaped slashes e.g. "%m//%d//%y"
* yesstr and nostr should be empty e.g. ""
* country_ab2 and country_ab3 should be empty e.g. ""

We bump LC_IDENTIFICATION version and adjust the date to
indicate the change in the locale.

A new tst-c-utf8-consistency test is added to ensure
consistency between C/POSIX and C.UTF-8.

Tested on x86_64 and i686 without regression.

[1] https://sourceware.org/pipermail/libc-alpha/2022-January/135703.html

Co-authored-by: Florian Weimer <fweimer@redhat.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
2022-02-01 11:12:36 -05:00
Paul Eggert
581c785bf3 Update copyright dates with scripts/update-copyrights
I used these shell commands:

../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")

and then ignored the output, which consisted of lines saying "FOO: warning:
copyright statement not found" for each of 7061 files FOO.

I then removed trailing white space from math/tgmath.h,
support/tst-support-open-dev-null-range.c, and
sysdeps/x86_64/multiarch/strlen-vec.S, to work around the following
obscure pre-commit check failure diagnostics from Savannah.  I don't
know why I run into these diagnostics whereas others evidently do not.

remote: *** 912-#endif
remote: *** 913:
remote: *** 914-
remote: *** error: lines with trailing whitespace found
...
remote: *** error: sysdeps/unix/sysv/linux/statx_cp.c: trailing lines
2022-01-01 11:40:24 -08:00
Maxim Kuvyrkov
c16dc431c8 Update copyright header in recently merged ab_GE locale
ab_GE locale was committed under DCO and this header
proposed in [1] suits it better.

[1] https://sourceware.org/pipermail/libc-alpha/2021-September/130692.html

Signed-off-by: Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org>
Signed-off-by: Nart Tlisha <daniel.abzakh@gmail.com>
2021-12-17 18:22:21 +00:00
Nart Tlisha
a16c5ab139 localedata: add new locale ab_GE
Add the Abkhazian language in the Georgia territory

The ab_GE was just recently added to CLDR, it should be available
in CLDR v41, https://github.com/unicode-org/cldr/pull/1402

The Abkhazian language has been added to Gnome for localization

The locale has been tested on Ubuntu 20.04, Mint 20.2 and Fedora 35 Beta

Signed-off-by: Nart Tlisha <daniel.abzakh@gmail.com>
Reviewed-by: Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org>
2021-12-16 14:37:14 +00:00
Adhemerval Zanella
3a523ccd78 locale: Fix localedata/sort-test undefined behavior
The collate-test.c triggers UB with a signed integer overflow,
which results in an error on some architectures (powerpc32).

Checked on x86_64, i686, and powerpc.
2021-11-08 15:28:48 -03:00
Mike FABIAN
b517256015 Update to Unicode 14.0.0 [BZ #28390]
Unicode 14.0.0 Support: Character encoding, character type info, and
transliteration tables are all updated to Unicode 14.0.0, using
the generator scripts contributed by Mike FABIAN (Red Hat).

Total added characters in newly generated CHARMAP: 838
Total removed characters in newly generated WIDTH: 1
    (Characters not in WIDTH get width 1 by default, i.e. these have width 1 now.)
    removed: <U1734> 0 : eaw=N category=Mc bidi=L   name=HANUNOO SIGN PAMUDPOD
    That seems intentional, the character had category Mn (Mark, nonspacing) before
    and now has Mc (Mark, spacing combining)
Total changed characters in newly generated WIDTH: 0
Total added characters in newly generated WIDTH: 175
2021-10-04 08:54:27 +02:00
Carlos O'Donell
466f2be6c0 Add generic C.UTF-8 locale (Bug 17318)
We add a new C.UTF-8 locale. This locale is not builtin to glibc, but
is provided as a distinct locale. The locale provides full support for
UTF-8 and this includes full code point sorting via STRCMP-based
collation (strcmp or wcscmp).

The collation uses a new keyword 'codepoint_collation' which drops all
collation rules and generates an empty zero rules collation to enable
STRCMP usage in collation. This ensures that we get full code point
sorting for C.UTF-8 with a minimal 1406 bytes of overhead (LC_COLLATE
structure information and ASCII collating tables).

The new locale is added to SUPPORTED. Minimal test data for specific
code points (minus those not supported by collate-test) is provided in
C.UTF-8.in, and this verifies code point sorting is working reasonably
across the range. The locale was tested manually with the full set of
code points without failure.

The locale is harmonized with locales already shipping in various
downstream distributions. A new tst-iconv9 test is added which verifies
the C.UTF-8 locale is generally usable.

Testing for fnmatch, regexec, and regcomp is provided by extending
bug-regex1, bug-regex19, bug-regex4, bug-regex6, transbug, tst-fnmatch,
tst-regcomp-truncated, and tst-regex to use C.UTF-8.

Tested on x86_64 and i686 without regression.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2021-09-06 11:30:28 -04:00
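Note on 466f2be6c0: the reason strcmp-based collation gives code point
sorting is that UTF-8 preserves code point order when its bytes are
compared as unsigned values. A small Python sketch of that property
follows; the sample strings are made up, and this is not how glibc
implements 'codepoint_collation'.

```python
def sort_like_strcmp(strings):
    """Sort by UTF-8 bytes, which is the order strcmp would see."""
    return sorted(strings, key=lambda s: s.encode("utf-8"))


def sort_by_code_point(strings):
    """Sort by lists of code points (Python's native str ordering)."""
    return sorted(strings, key=lambda s: [ord(c) for c in s])


# Made-up sample covering 1-, 2-, 3- and 4-byte UTF-8 sequences.
SAMPLE = ["z", "é", "a", "€", "😀", "Z", "ก"]

if __name__ == "__main__":
    by_bytes = sort_like_strcmp(SAMPLE)
    by_code_point = sort_by_code_point(SAMPLE)
    print(by_bytes)
    print(by_bytes == by_code_point)
```

Both orderings agree, so no collation tables beyond the ASCII ones are
needed for this kind of sort.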
Siddhesh Poyarekar
30891f35fa Remove "Contributed by" lines
We stopped adding "Contributed by" or similar lines in sources in 2012
in favour of git logs and keeping the Contributors section of the
glibc manual up to date.  Removing these lines makes the license
header a bit more consistent across files and also removes the
possibility of error in attribution when license blocks or files are
copied across since the contributed-by lines don't actually reflect
reality in those cases.

Move all "Contributed by" and similar lines (Written by, Test by,
etc.) into a new file CONTRIBUTED-BY to retain record of these
contributions.  These contributors are also mentioned in
manual/contrib.texi, so we just maintain this additional record as a
courtesy to the earlier developers.

The following scripts were used to filter a list of files to edit in
place and to clean up the CONTRIBUTED-BY file respectively.  These
were not added to the glibc sources because they're not expected to be
of any use in future given that this is a one time task:

https://gist.github.com/siddhesh/b5ecac94eabfd72ed2916d6d8157e7dc
https://gist.github.com/siddhesh/15ea1f5e435ace9774f485030695ee02

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2021-09-03 22:06:44 +05:30
Siddhesh Poyarekar
2d2d9f2b48 Move malloc hooks into a compat DSO
Remove all malloc hook uses from core malloc functions and move it
into a new library libc_malloc_debug.so.  With this, the hooks now no
longer have any effect on the core library.

libc_malloc_debug.so is a malloc interposer that needs to be preloaded
to get hooks functionality back so that the debugging features that
depend on the hooks, i.e. malloc-check, mcheck and mtrace work again.
Without the preloaded DSO these debugging features will be nops.
These features will be ported away from hooks in subsequent patches.

Similarly, legacy applications that need hooks functionality need to
preload libc_malloc_debug.so.

The symbols exported by libc_malloc_debug.so are maintained at exactly
the same version as libc.so.

Finally, static binaries will no longer be able to use malloc
debugging features since they cannot preload the debugging DSO.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
2021-07-22 18:37:59 +05:30
Siddhesh Poyarekar
06a1b79407 Reinstate gconv-modules as the default configuration file
Reinstate gconv-modules as the main file so that the configuration
files in gconv-modules.d/ become add-on configuration.  With this, the
effective user visible change is that GCONV_PATH can now have
supplementary configuration in GCONV_PATH/gconv-modules.d/ in addition
to the main GCONV_PATH/gconv-modules file.
2021-06-14 18:38:09 +05:30
Siddhesh Poyarekar
fc5bfade69 iconvdata: Move gconv-modules configuration to gconv-modules.conf
Move all gconv-modules configuration files to gconv-modules.conf.
That is, the S390 extensions now become gconv-modules-s390.conf.  Move
both configuration files into gconv-modules.d.

Now GCONV_PATH/gconv-modules is read only for backward compatibility
for third-party gconv modules directories.

Reviewed-by: DJ Delorie <dj@redhat.com>
2021-06-09 09:47:16 +05:30
Florian Weimer
f17164bd51 localedata: Use U+00AF MACRON in more EBCDIC charsets [BZ #27882]
This updates IBM256, IBM277, IBM278, IBM280, IBM284, IBM297, IBM424
in the same way that IBM273 was updated for bug 23290.

IBM256 and IBM424 still have holes after this change, so HAS_HOLES
is not updated.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2021-05-18 07:21:45 +02:00
Sebastian Rasmussen
ebde2baeb5 Update sv_SE to treat 'W' as a distinct character (Bug 25036)
The 13th edition of Svenska Akademiens ordlista lists 'W' as a
distinct letter that sorts after 'V'. We adjust the sv_SE locale
(and tests) to match this updated and "reformed" language change.
This harmonizes us with CLDR 1.5.0 (2007) for sv_SE sorting of
the letter 'W'.

No regressions on x86_64, and locale sorting tests all pass.

Co-authored-by: Carlos O'Donell <carlos@redhat.com>
2021-04-06 12:34:02 -04:00
Marc Aurèle La France
c6e2ca2c3f POSIX locale: Fix typo in comment 2021-01-09 12:14:44 +01:00
Paul Eggert
2b778ceb40 Update copyright dates with scripts/update-copyrights
I used these shell commands:

../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")

and then ignored the output, which consisted of lines saying "FOO: warning:
copyright statement not found" for each of 6694 files FOO.
I then removed trailing white space from benchtests/bench-pthread-locks.c
and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this
diagnostic from Savannah:
remote: *** pre-commit check failed ...
remote: *** error: lines with trailing whitespace found
remote: error: hook declined to update refs/heads/master
2021-01-02 12:17:34 -08:00
Andreas Schwab
8f8052c2aa Revert "Fix missing redirects in testsuite targets"
This reverts commit d5afb38503.  The log files are actually created by the
various shell scripts that drive the tests.
2020-10-08 10:09:30 +02:00
Carlos O'Donell
8cde977077 en_US: Minimize changes to date_fmt (Bug 25923)
In 2000 when date_fmt was originally added as an extension the
en_US locale did not have a date_fmt specifier and so used the
default which resulted in the abbreviated month name coming
before the day of the month (as expected in the US and other
locales).  In commit 7395f3a0ef the
date_fmt was added to en_US with a 12H time to better align with
US user expectations.  Unfortunately the abbreviated month name
and day were inverted during that transition, and that was seen
as a regression and reported against Fedora 32:
https://bugzilla.redhat.com/show_bug.cgi?id=1830623

The progression of date_fmt looks like this:
"%a %b %e %H:%M:%S %Z %Y"    <- Originally (2000)
"%a %d %b %Y %I:%M:%S %p %Z" <- glibc 2.29 (2019)
"%a %b %e %r %Z %Y"          <- glibc 2.32 (2020) [this commit]

Note: "%r" is "%I:%M:%S %p" in en_US and so shorter to write.

Likewise the year is in the wrong place in commit
7395f3a0ef and this is corrected in
this patch.

For reference d_t_fmt:
"%a %d %b %Y %r %Z"          <- d_t_fmt    (1997)

Yes, d_t_fmt and date_fmt are *not* the same, this is just the
history of this locale. This commit does not change d_t_fmt to
better align with date_fmt. No users have requested we change
d_t_fmt or given any justification for such a change.

The only goals of this change are to place the abbreviated month
name before the day of the month as it has been printed since
2000, and place the year at the end. This minimizes the change
from commit 7395f3a0ef and makes
good on changing only from 24H clock to 12H clock.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2020-07-16 17:17:10 -04:00
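Note on 8cde977077: to make the date_fmt progression easier to compare,
here is a small Python sketch that expands the three format strings for
one fixed, made-up timestamp. It uses a hand-written table of field values
rather than strftime, so it only illustrates field order and does not
reproduce locale behaviour; the FIELDS table and render() are hypothetical
helpers.

```python
import re

# Pre-expanded fields for one made-up timestamp (assumption, en_US style).
FIELDS = {
    "a": "Thu", "b": "Jul", "d": "16", "e": "16", "Y": "2020",
    "H": "17", "I": "05", "M": "17", "S": "10", "p": "PM", "Z": "EDT",
}
# "%r" is "%I:%M:%S %p" in en_US, as noted in the commit message.
FIELDS["r"] = "{I}:{M}:{S} {p}".format(**FIELDS)

DATE_FMT_HISTORY = [
    ("originally (2000)", "%a %b %e %H:%M:%S %Z %Y"),
    ("glibc 2.29 (2019)", "%a %d %b %Y %I:%M:%S %p %Z"),
    ("glibc 2.32 (2020)", "%a %b %e %r %Z %Y"),
]


def render(fmt):
    """Replace each %X specifier with its entry from FIELDS."""
    return re.sub(r"%([A-Za-z])", lambda m: FIELDS[m.group(1)], fmt)


if __name__ == "__main__":
    for label, fmt in DATE_FMT_HISTORY:
        print(label, "->", render(fmt))
```

With these values the first and last formats keep "Jul 16" with the year
at the end, while the 2.29 format prints "16 Jul 2020" in the middle.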
Mike FABIAN
6e540caa21 Set width of JUNGSEONG/JONGSEONG characters from UD7B0 to UD7FB to 0 [BZ #26120]
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2020-06-26 09:54:43 +02:00
Florian Weimer
3404def00a ckb_IQ, or_IN locales: Add missing reorder-end keywords
This suppresses a non-fatal error during locale building.

Reviewed-by: Rafał Lużyński <digitalfreak@lingonborough.com>
2020-05-08 10:52:22 +02:00
Carlos O'Donell
df6c63ebbc localedef: Add tests-container test for --no-hard-links.
The new tst-localedef-hardlinks test verifies that compiling
two locales (with the default output directory), one with
--no-hard-links and one without the option, results in the
expected behaviour.  When --no-hard-links is used the link
count on LC_CTYPE is 1, indicating that even though the two
locales are identical (though with differently named source
files and output directories) localedef did not carry out the
hard link optimization.  When --no-hard-links is omitted the
localedef hard link optimization is correctly carried out and
for 2 compiled locales the link count for LC_CTYPE is 2.

Reviewed-by: DJ Delorie <dj@redhat.com>
2020-04-30 16:28:07 -04:00
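Note on df6c63ebbc: the link counts mentioned above are the st_nlink
values reported by stat. The Python sketch below shows the two cases on
stand-in files in a temporary directory; it does not run localedef, the
file names are hypothetical, and it assumes a filesystem that supports
hard links.

```python
import os
import shutil
import tempfile


def link_count(path):
    """Return the number of hard links to path."""
    return os.stat(path).st_nlink


def demo_link_counts():
    """Return (count when copied, count when hard-linked)."""
    with tempfile.TemporaryDirectory() as root:
        first = os.path.join(root, "first.LC_CTYPE")
        with open(first, "wb") as handle:
            handle.write(b"identical locale data")

        # Separate copy: same content, independent file, count stays 1.
        copied = os.path.join(root, "copied.LC_CTYPE")
        shutil.copyfile(first, copied)
        count_copied = link_count(first)

        # Hard link: both names share one file, so the count becomes 2.
        linked = os.path.join(root, "linked.LC_CTYPE")
        os.link(first, linked)
        count_linked = link_count(first)

        return count_copied, count_linked


if __name__ == "__main__":
    print(demo_link_counts())
```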
Mike FABIAN
8645f62469 Bug 25819: Update to Unicode 13.0.0
Unicode 13.0.0 Support: Character encoding, character type info, and
transliteration tables are all updated to Unicode 13.0.0, using
the generator scripts contributed by Mike FABIAN (Red Hat).

Total added characters in newly generated CHARMAP: 5930
Total added characters in newly generated WIDTH: 5536
2020-04-21 18:17:23 +02:00
kokoye2007
8a1d13d0c7 Updates to the shn_MM locale [BZ #25532] 2020-04-08 12:22:36 +02:00
Rafał Lużyński
10b2cdc3b3 oc_FR locale: Fix spelling of April (bug 25639)
Confirmed by CLDR and a native speaker: "abril" is more often used even
if "abrial" is also correct.  Both nominative (alt_mon) and genitive (mon)
cases are updated.
2020-04-07 00:20:53 +02:00
Rafał Lużyński
649fdf039b oc_FR locale: Fix spelling of Thursday (bug 25639)
As reported by a native speaker:

Thursday: "dijóus" -> "dijòus" (also confirmed by CLDR)
2020-03-19 00:19:07 +01:00
Mike FABIAN
eb948facd8 Fix typo in the name for Wednesday in Kurdish [BZ #9809] 2020-02-11 10:18:45 +01:00
Mike FABIAN
cdeae33d71 Update or_IN collation [BZ #22525]
- Add a test file or_IN.UTF-8.in.
- Make the collation agree with CLDR.
2020-02-03 10:19:20 +01:00
Mike FABIAN
ae199e7d64 Fix ckb_IQ [BZ #9809]
Add ckb_IQ to SUPPORTED file.
Add ckb_IQ.UTF-8.in collation test file.
Mention new ckb_IQ locale in NEWS.
2020-02-03 10:19:20 +01:00
Jwtiyar Nariman
4267522f5e Add new locale: ckb_IQ (Kurdish/Sorani spoken in Iraq) [BZ #9809] 2020-02-03 10:19:20 +01:00
Rafał Lużyński
135540285c sl_SI locale: Use "." as the thousands separator (bug 25233)
This is correct according to CLDR [1] and Florian Weimer's quick
research. [2]

[1] https://st.unicode.org/cldr-apps/v#/sl/Symbols/
[2] https://sourceware.org/bugzilla/show_bug.cgi?id=25233#c0

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2020-01-08 00:13:48 +01:00
Rafał Lużyński
75ba929987 Multiple locales: Add date_fmt (bug 24054)
It is not specified what should be the content of d_t_fmt and date_fmt
but in the built-in C locale those fields have only one difference:
date_fmt contains "%Z" (the current time zone) while d_t_fmt does not.

For most of the locales this commit does the following operation:
copy d_t_fmt to date_fmt, and then remove "%Z" from d_t_fmt.
If "%Z" was originally missing from d_t_fmt add it to date_fmt.
It also corrects comments where necessary.

Exceptions:

* In bo_CN, dz_BT, and km_KH "%Z" has not been added to date_fmt because
  it was too difficult.  In these locales date_fmt has been set to the
  copy of d_t_fmt.
* In en_DK "%Z" has not been removed from d_t_fmt in order to preserve
  the conformance with the standard mentioned in the comment.

The command to identify and initially edit the locales that need the
update was:

    for i in `grep -lw d_t_fmt *`
    do
        if ! grep -qw date_fmt $i ; then
            awk '/d_t_fmt/ { print $0; gsub("d_t_fmt", "date_fmt"); } //{ print $0 }' < $i > $i.next
            mv $i.next $i
        fi
    done

and then each file was further edited manually.
2020-01-02 11:45:45 +01:00
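Note on 75ba929987: the general rule described above can be sketched in
Python as follows. This is only an illustration, not the procedure that
was actually used (the files were edited by hand after the awk step).
The function name is hypothetical, and appending " %Z" at the end when it
is missing is an assumption, since the commit does not say where it was
inserted.

```python
import re


def split_formats(d_t_fmt):
    """Return (new d_t_fmt, date_fmt) derived from an old d_t_fmt."""
    if "%Z" in d_t_fmt:
        # Copy d_t_fmt to date_fmt, then drop "%Z" from d_t_fmt.
        date_fmt = d_t_fmt
        new_d_t_fmt = re.sub(r"\s*%Z", "", d_t_fmt, count=1).strip()
    else:
        # "%Z" was missing: keep d_t_fmt, add "%Z" to date_fmt.
        # Assumption: the time zone goes at the end.
        date_fmt = d_t_fmt + " %Z"
        new_d_t_fmt = d_t_fmt
    return new_d_t_fmt, date_fmt


if __name__ == "__main__":
    print(split_formats("%a %d %b %Y %T %Z"))
    print(split_formats("%a %d %b %Y %T"))
```

Both calls give a d_t_fmt without "%Z" and a date_fmt with it, matching
the one difference between the two fields in the built-in C locale.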
Joseph Myers
d614a75396 Update copyright dates with scripts/update-copyrights. 2020-01-01 00:14:33 +00:00
Rafał Lużyński
d99b500e3d lv_LV locale: Correct the time part of d_t_fmt (bug 25324)
Currently d_t_fmt formats time as "plkst. %H un %M".  A quick Google
search says that "plkst." means "o’clock" and "un" means "and".
Also this format does not display seconds.

CLDR does not mention anything like that.  We have no reason to use
anything different than "%H:%M:%S".
2019-12-30 11:48:20 +01:00
Rafał Lużyński
20a740b2b2 km_KH locale: Use "%M" instead of "m" in d_t_fmt (bug 25323)
A quick analysis suggests that the original author meant "%M" (minutes
format specifier) instead of "m" which is just a literal "m" letter.
2019-12-30 11:48:19 +01:00
Rafał Lużyński
b8c210bcc7 mnw_MM, my_MM, and shn_MM locales: Do not use %Op
The "O" modifier does nothing when used with "%p", so it is better not to
use it at all; replace "%Op" with "%p".
2019-12-23 23:49:22 +01:00
Rafał Lużyński
c372d2e863 ru_UA locale: use copy "ru_RU" in LC_TIME (bug 25044)
Replacing incorrect abbreviated weekday names "Пнд", "Вто", "Срд"...
with correct ones "Пн", "Вт", "Ср"... makes the LC_TIME sections in
those two locales almost identical.  The only remaining difference
was that ab_alt_mon elements in ru_UA were lowercase while in ru_RU
they had the first letter uppercase; the latter was pointed out as
a better choice by a native speaker.  This commit unifies LC_TIME
between ru_RU and ru_UA.
2019-11-26 11:54:29 +01:00
Talachan Mon
c5fbd7c3ea Add new locale: mnw_MM (Mon language spoken in Myanmar) [BZ #25139] 2019-11-06 08:15:16 +01:00
Arjun Shankar
513aaa0d78 Add Transliterations for Unicode Misc. Mathematical Symbols-A/B [BZ #23132]
This commit adds previously missing transliterations for several code points
in the Unicode blocks "Miscellaneous Mathematical Symbols-A/B" -
transliterated to their approximate ASCII representations.  It also adds a
corresponding iconv transliteration test.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2019-10-25 19:45:55 +02:00
DJ Delorie
97476447ed Install charmaps uncompressed in testroot
The testroot does not have a gunzip command, so the charmap files
should not be installed gzipped else they cannot be used (and thus
tested).  With this patch, installing with INSTALL_UNCOMPRESSED=yes
installs uncompressed charmaps instead.

Note that we must purge the $(symbolic_link_list) as it contains
references to $(DESTDIR), which we change during the testroot
installation.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2019-10-24 17:01:04 -04:00
Mike FABIAN
8e42fc6811 Sync "language", "lang_name", "territory", "country_name" with CLDR/langtable
Sync these values with CLDR and langtable as much as possible.  Add
missing values.

If possible, take the values from CLDR; if CLDR does not have a value,
take it from langtable. The values from langtable which are not from
CLDR are from Wikipedia or native speakers.
2019-10-01 10:27:02 +02:00
Paul Eggert
5a82c74822 Prefer https to http for gnu.org and fsf.org URLs
Also, change sources.redhat.com to sourceware.org.
This patch was automatically generated by running the following shell
script, which uses GNU sed, and which avoids modifying files imported
from upstream:

sed -ri '
  s,(http|ftp)(://(.*\.)?(gnu|fsf|sourceware)\.org($|[^.]|\.[^a-z])),https\2,g
  s,(http|ftp)(://(.*\.)?)sources\.redhat\.com($|[^.]|\.[^a-z]),https\2sourceware.org\4,g
' \
  $(find $(git ls-files) -prune -type f \
      ! -name '*.po' \
      ! -name 'ChangeLog*' \
      ! -path COPYING ! -path COPYING.LIB \
      ! -path manual/fdl-1.3.texi ! -path manual/lgpl-2.1.texi \
      ! -path manual/texinfo.tex ! -path scripts/config.guess \
      ! -path scripts/config.sub ! -path scripts/install-sh \
      ! -path scripts/mkinstalldirs ! -path scripts/move-if-change \
      ! -path INSTALL ! -path  locale/programs/charmap-kw.h \
      ! -path po/libc.pot ! -path sysdeps/gnu/errlist.c \
      ! '(' -name configure \
            -execdir test -f configure.ac -o -f configure.in ';' ')' \
      ! '(' -name preconfigure \
            -execdir test -f preconfigure.ac ';' ')' \
      -print)

and then by running 'make dist-prepare' to regenerate files built
from the altered files, and then executing the following to cleanup:

  chmod a+x sysdeps/unix/sysv/linux/riscv/configure
  # Omit irrelevant whitespace and comment-only changes,
  # perhaps from a slightly-different Autoconf version.
  git checkout -f \
    sysdeps/csky/configure \
    sysdeps/hppa/configure \
    sysdeps/riscv/configure \
    sysdeps/unix/sysv/linux/csky/configure
  # Omit changes that caused a pre-commit check to fail like this:
  # remote: *** error: sysdeps/powerpc/powerpc64/ppc-mcount.S: trailing lines
  git checkout -f \
    sysdeps/powerpc/powerpc64/ppc-mcount.S \
    sysdeps/unix/sysv/linux/s390/s390-64/syscall.S
  # Omit change that caused a pre-commit check to fail like this:
  # remote: *** error: sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S: last line does not end in newline
  git checkout -f sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S
2019-09-07 02:43:31 -07:00
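Note on 5a82c74822: for readers who want to try the two substitutions on
sample text, here is a Python rendering of the sed expressions. It is a
sketch only; the commit used GNU sed, and Python's backtracking matcher
can differ from POSIX ERE matching in corner cases. The function name and
the sample URLs are made up.

```python
import re

# First sed expression: gnu.org, fsf.org and sourceware.org URLs.
GNU_RE = re.compile(
    r"(http|ftp)(://(.*\.)?(gnu|fsf|sourceware)\.org($|[^.]|\.[^a-z]))",
    re.MULTILINE,  # sed works line by line, so "$" means end of line
)
# Second sed expression: sources.redhat.com becomes sourceware.org.
REDHAT_RE = re.compile(
    r"(http|ftp)(://(.*\.)?)sources\.redhat\.com($|[^.]|\.[^a-z])",
    re.MULTILINE,
)


def prefer_https(text):
    """Apply both substitutions in the same order as the sed script."""
    text = GNU_RE.sub(r"https\2", text)
    return REDHAT_RE.sub(r"https\2sourceware.org\4", text)


if __name__ == "__main__":
    for url in (
        "http://www.gnu.org/licenses/",
        "ftp://ftp.gnu.org/gnu/glibc",
        "http://sources.redhat.com/glibc",
        "http://www.gnu.org.uk/",
        "http://example.org/",
    ):
        print(url, "->", prefer_https(url))
```

The trailing alternation keeps hosts such as www.gnu.org.uk untouched,
because ".org" there is followed by a dot and a lowercase letter.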
Rafal Luzynski
c0fd3244e7 Chinese locales: Set first_weekday to 2 (bug 24682).
The first day of the week in China (Mainland) should be Monday according
to the national standard GB/T 7408-2005.  References:

* https://www.doc88.com/p-1166696540287.html
* https://unicode-org.atlassian.net/browse/CLDR-11510

	[BZ #24682]
	* localedata/locales/bo_CN (first_weekday): Add, set to 2 (Monday).
	* localedata/locales/ug_CN (first_weekday): Likewise.
	* localedata/locales/zh_CN (first_weekday): Likewise.
2019-08-23 00:07:06 +02:00
Rafal Luzynski
9208c3b804 Afar locales: Months and days updated from CLDR (bug 21897).
This commit updates month and weekday names (full and abbreviated)
from CLDR 35.1 with the following exceptions.

It was not clear why the full name of February in aa_DJ and aa_ER was
"Kudo" while the abbreviated version is "Nah" but some additional
sources [1] [2] as well as the content of aa_ER and aa_ER@saaho
suggest it should be "Naharsi Kudo".  This commit consequently sets
the translation of February to "Naharsi Kudo" in aa_DJ and aa_ET.

aa_ER@saaho is not supported by CLDR but since the month names were
identical to aa_ER before this commit, the same values have been copied
from aa_ER.

Links:

[1] https://fr.wiktionary.org/wiki/naharsi_kudo
[2] http://www.mcit.gov.et/web/guest/-/localization-standard-for-afaraf

	[BZ #21897]
	* localedata/locales/aa_DJ (abday): Update from CLDR, all words
	begin with an uppercase letter now.
	(abmon): Likewise.
	(mon): Update from CLDR, reword February from "Kudo" to
	"Naharsi Kudo", April from "Agda Baxisso" to "Agda Baxis",
	and August from "Liiqen" to "Leqeeni".
	* localedata/locales/aa_ER (mon): Update from CLDR, reword
	April from "Agda Baxisso" to "Agda Baxis" and August from
	"Leqeeni" to "Liiqen".
	* localedata/locales/aa_ER@saaho (mon): Likewise.
	* localedata/locales/aa_ET (abmon): Update from CLDR, reword
	abbreviated February from "Kud" to "Nah".
	(mon): Update from CLDR, reword February from "Kudo" to
	"Naharsi Kudo" and April from "Agda Baxisso" to "Agda Baxis".
2019-07-17 11:58:21 +02:00
Rafal Luzynski
fba6d4bbce nl_BE locale: Use "copy "nl_NL"" in LC_NAME (bug 23996).
The content of the section is identical in both languages.

	[BZ #23996]
	* localedata/locales/nl_BE (LC_NAME): Replace with “copy "nl_NL"”.
2019-07-17 11:53:08 +02:00
PanderMusubi
3cc7c9c5f1 nl_BE and nl_NL locales: Dutch salutations (bug 23996).
[BZ #23996]
	* localedata/locales/nl_BE (LC_NAME): Add name_gen, name_mr,
	name_mrs, name_miss, and name_ms.
	* localedata/locales/nl_NL (LC_NAME): Likewise.
2019-07-17 11:50:42 +02:00
Daniil Zhilin
cce7b6a578 ga_IE and en_IE locales: Revert first_weekday removal (bug 24200).
These values were removed by the commit 0a410e76f5.

	[BZ #24200]
	* localedata/locales/ga_IE (first_weekday): Add, set to 2 (Monday).
	* localedata/locales/en_IE (first_weekday): Likewise.
2019-07-17 11:41:24 +02:00
Rafal Luzynski
a55541fd1c szl_PL locale: Fix a typo in the previous commit (bug 24652).
The Unicode sequences in the format <Uxxxx> should be used instead of
non-ASCII characters.

Reported by Piotr Drąg:
https://sourceware.org/bugzilla/show_bug.cgi?id=24652#c8

	[BZ #24652]
	* localedata/locales/szl_PL (day): Use the correct Unicode
	sequences instead of non-ASCII characters.
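
A minimal sketch of the conversion this fix applies, assuming the usual
<Uxxxx> notation with four hex digits for BMP characters and eight digits
above that. The helper name is hypothetical.

```python
def to_locale_escapes(text):
    """Replace every non-ASCII character with its <Uxxxx> sequence."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)              # plain ASCII is kept as-is
        elif cp <= 0xFFFF:
            out.append(f"<U{cp:04X}>")  # BMP: four hex digits
        else:
            out.append(f"<U{cp:08X}>")  # assumption: eight digits beyond the BMP
    return "".join(out)


print(to_locale_escapes("Drąg"))
```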
2019-06-24 22:17:58 +02:00
Grzegorz Kulik
2bd81b60d6 szl_PL locale: Spelling corrections (bug 24652).
This commit also provides the correct month names in both nominative
and genitive case for Silesian language, as required by the fix for
the bug 10871.

	[BZ #24652]
	* localedata/locales/szl_PL (abday): Spelling corrections.
	(day): Likewise.
	(abmon): Likewise.
	(mon): Rename to...
	(alt_mon): This, then apply spelling corrections.
	(mon): New entry, month names in the genitive case.
2019-06-24 10:59:11 +02:00
Rafal Luzynski
fefa21790b nl_{AW,NL}: Correct the thousands separator and grouping (bug 23831).
According to CLDR 35.1 and the bug report, the thousands grouping
separator should always be "." (a single dot) and digits should be
grouped by 3.

	[BZ #23831]
	* localedata/locales/nl_AW (mon_thousands_sep): Set to ".".
	* localedata/locales/nl_NL (mon_thousands_sep): Likewise.
	(thousands_sep): Likewise.
	(grouping): Set to 3;3.
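
The effect of a "." separator with groups of three can be sketched as
follows. This is a simplified illustration that handles only uniform group
sizes, not the full grouping syntax, and the function name is hypothetical.

```python
def group_digits(digits, sep=".", size=3):
    """Insert sep between groups of `size` digits, counting from the right."""
    groups = []
    while len(digits) > size:
        groups.insert(0, digits[-size:])
        digits = digits[:-size]
    groups.insert(0, digits)
    return sep.join(groups)


print(group_digits("1234567"))
```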
2019-06-21 20:48:35 +02:00
Rafal Luzynski
f59a54ab0c nl_AW locale: Correct the negative monetary format (bug 24614).
Follow the same changes as made in the commit 02d8b5ab1c because the
respective entries in nl_NL and nl_AW had been the same before the change
so they should be the same after.  CLDR does not provide complete data
for nl_AW; it says the data is missing and displays a copy of nl_NL.

	[BZ #24614]
	* localedata/locales/nl_AW (n_sep_by_space): Set to 2 (a space
	between the currency symbol and the minus sign).
	(n_sign_posn): Set to 4 (the minus sign after the currency symbol).
2019-06-19 23:44:47 +02:00
Rafal Luzynski
02d8b5ab1c nl_NL locale: Correct the negative monetary format (bug 24614).
According to CLDR 35.1 and the bug report the correct monetary format
for negative amounts should be "EUR -1 234,56" while previously it was
"EUR 1 234,56-".

This patch does not change the thousands (grouping) separator.

	[BZ #24614]
	* localedata/Makefile (LOCALES): Add nl_NL.UTF-8.
	* localedata/locales/nl_NL (n_sep_by_space): Set to 2 (a space
	between the currency symbol and the minus sign).
	(n_sign_posn): Set to 4 (the minus sign after the currency symbol).
	* localedata/tst-strfmon1.c (tests): Add test data for nl_NL.UTF-8.
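
A sketch of how the two keywords interact, following the POSIX description of
sep_by_space and sign_posn. It covers only the case where the currency symbol
precedes the value and only sign positions 2 and 4; the function is
hypothetical and is not how strfmon is implemented. The "old" setting in the
usage lines is just one combination that yields the previous output, chosen
for illustration.

```python
def format_negative(amount, symbol, sep_by_space, sign_posn, sign="-"):
    """Format a negative amount, assuming the symbol precedes the value."""
    if sep_by_space not in (0, 1, 2) or sign_posn not in (2, 4):
        raise ValueError("combination not covered by this sketch")
    if sign_posn == 4:
        # Sign immediately follows the symbol: symbol, sign, value.
        if sep_by_space == 1:
            return f"{symbol}{sign} {amount}"   # space before the value
        if sep_by_space == 2:
            return f"{symbol} {sign}{amount}"   # space between symbol and sign
        return f"{symbol}{sign}{amount}"
    # sign_posn == 2: sign follows the value, so symbol and sign are apart.
    if sep_by_space == 1:
        return f"{symbol} {amount}{sign}"       # space between symbol and value
    if sep_by_space == 2:
        return f"{symbol}{amount} {sign}"       # space between value and sign
    return f"{symbol}{amount}{sign}"


print(format_negative("1 234,56", "EUR", sep_by_space=2, sign_posn=4))  # new
print(format_negative("1 234,56", "EUR", sep_by_space=1, sign_posn=2))  # old
```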
2019-06-17 23:42:06 +02:00
mansayk
157cda1ff0 tt_RU: Add lang_name [BZ #24370]
This commit adds a lang_name according to CLDR-35.1.

	[BZ #24370]
	* localedata/locales/tt_RU (lang_name): Add from CLDR-35.1.
2019-05-28 22:13:32 +02:00
mansayk
182a3746b8 tt_RU: Fix orthographic mistakes in mon and abmon sections [BZ #24369]
This commit fixes some errors and converts all month names to lowercase.
The content is synchronized with CLDR-35.1 now but trailing dots are
removed from abmon values in order to maintain consistency with the
previous values and with many other locales which do the same.

	[BZ #24369]
	* localedata/locales/tt_RU (mon): Update from CLDR-35.1, fix errors.
	(abmon): Likewise, but remove the trailing dots.
2019-05-28 22:11:22 +02:00
Mike FABIAN
f6efec90c8 Bug 24535: Update to Unicode 12.1.0
Unicode 12.1.0 Support: Character encoding, character type info, and
transliteration tables are all updated to Unicode 12.1.0, using
the generator scripts contributed by Mike FABIAN (Red Hat).

Some info about the number of characters added or changed:

Total added characters in newly generated CHARMAP: 1
added: <U32FF>     /xe3/x8b/xbf SQUARE ERA NAME REIWA
Total added characters in newly generated WIDTH: 1
added: <U32FF> 2 : eaw=W category=So bidi=L   name=SQUARE ERA NAME REIWA
graph: Added 1 characters in new ctype which were not in old ctype
graph: Added: ㋿ U+32FF SQUARE ERA NAME REIWA
print: Added 1 characters in new ctype which were not in old ctype
print: Added: ㋿ U+32FF SQUARE ERA NAME REIWA
punct: Added 1 characters in new ctype which were not in old ctype
punct: Added: ㋿ U+32FF SQUARE ERA NAME REIWA
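
The byte sequence shown in the CHARMAP line can be reproduced with a short
sketch; the helper name is hypothetical and only mimics the "/x.." notation
used above.

```python
def charmap_bytes(code_point):
    """Return the UTF-8 bytes of a code point in CHARMAP-style notation."""
    encoded = chr(code_point).encode("utf-8")
    return "".join(f"/x{byte:02x}" for byte in encoded)


print(charmap_bytes(0x32FF))
```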
2019-05-13 17:25:03 +02:00
TAMUKI Shoichi
466afec308 ja_JP locale: Add entry for the new Japanese era [BZ #22964]
The Japanese era name will be changed on May 1, 2019.  The Japanese
government made a preliminary announcement on April 1, 2019.

The glibc ja_JP locale must be updated to include the new era name for
strftime's alternative year format support.

Checked on x86_64-linux-gnu.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

ChangeLog:

	[BZ #22964]
	* localedata/locales/ja_JP (LC_TIME): Add entry for the new Japanese
	era.
	* time/tst-strftime2.c (dates): Add 2019-04-30 and 2019-05-01.
	(mkreftable): Add rules for the new Japanese era and the new dates.
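
To illustrate the two test dates, here is a simplified era lookup. It lists
only the two most recent eras with romanized names and counts the first
calendar year of an era as year 1; it is a sketch under those assumptions,
not the parser for the locale's "era" keyword.

```python
from datetime import date

# Newest era first; names are romanized for readability.
ERAS = [
    ("Reiwa", date(2019, 5, 1)),
    ("Heisei", date(1989, 1, 8)),
]


def era_year(day):
    """Return (era name, year within era) for a date covered by ERAS."""
    for name, start in ERAS:
        if day >= start:
            return name, day.year - start.year + 1
    raise ValueError("date precedes the eras listed in this sketch")


print(era_year(date(2019, 4, 30)))
print(era_year(date(2019, 5, 1)))
```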
2019-04-02 16:46:55 +09:00
Carlos O'Donell
62449176e0 Add verbose comments to 'era' in ja_JP locale.
Reviewed-by: Rafal Luzynski <digitalfreak@lingonborough.com>
Reviewed-by: TAMUKI Shoichi <tamuki@linet.gr.jp>
2019-04-01 15:14:16 -04:00
mansayk
57ada43c90 tt_RU: Fix orthographic mistakes in day and abday sections [BZ #24296]
This commit fixes some errors and converts all weekday names to lowercase.
The content is synchronized with CLDR-34 now, but trailing dots are removed
from abday values in order to maintain consistency with the previous values
and with many other locales which do the same.

	[BZ #24296]
	* localedata/locales/tt_RU (day): Update from CLDR-34, fix errors.
	(abday): Likewise, but remove the trailing dots.
2019-03-20 22:00:00 +01:00
Felix Yan
238d60a1fb localedata: Add Minguo calendar support to Taiwanese locales [BZ #24293]
The Minguo calendar is the official calendar system in Taiwan and is very
widely used there.  This commit adds support for it to glibc.

Some background information: the government website (www.gov.tw) uses it,
and popular public services like Taiwan HSR also use this calendar system.

Link to Wikipedia: https://en.wikipedia.org/wiki/Minguo_calendar

        [BZ #24293]
        * localedata/locales/zh_TW (era): Add, support Minguo calendar.
        * localedata/locales/cmn_TW (era): Likewise.
        * localedata/locales/hak_TW (era): Likewise.
        * localedata/locales/lzh_TW (era): Likewise.
        * localedata/locales/nan_TW (era): Likewise.
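
For orientation, the Minguo year is the Gregorian year minus 1911, so 1912 is
year 1. A minimal sketch (hypothetical helper, covering only years from 1912
onwards):

```python
def minguo_year(gregorian_year):
    """Convert a Gregorian year (1912 or later) to a Minguo year."""
    if gregorian_year < 1912:
        raise ValueError("years before 1912 are not covered by this sketch")
    return gregorian_year - 1911


print(minguo_year(2019))
```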
2019-03-15 10:08:37 +01:00
Mike FABIAN
86bdd49d93 Bug 24307: Update to Unicode 12.0.0
Unicode 12.0.0 Support: Character encoding, character type info, and
transliteration tables are all updated to Unicode 12.0.0, using
the generator scripts contributed by Mike FABIAN (Red Hat).

Some info about the number of characters added or changed:

Total added characters in newly generated CHARMAP: 554
Total added characters in newly generated WIDTH: 106
alpha: Missing 8 characters of old ctype in new ctype
       (These are combining marks, apparently they were removed from alpha
       on purpose)
alpha: Added 295 characters in new ctype which were not in old ctype
combining: Missing 2 characters of old ctype in new ctype
       (U+1CF2 VEDIC SIGN ARDHAVISARGA and U+1CF3 VEDIC SIGN ROTATED ARDHAVISARGA,
       these are now "Alphabetic" in Unicode 12.0.0)
combining: Added 37 characters in new ctype which were not in old ctype
combining_level3: Missing 2 characters of old ctype in new ctype
       (U+1CF2 VEDIC SIGN ARDHAVISARGA and U+1CF3 VEDIC SIGN ROTATED ARDHAVISARGA,
       these are now "Alphabetic" in Unicode 12.0.0)
combining_level3: Added 26 characters in new ctype which were not in old ctype
graph: Added 554 characters in new ctype which were not in old ctype
lower: Added 6 characters in new ctype which were not in old ctype
print: Added 554 characters in new ctype which were not in old ctype
punct: Missing 29 characters of old ctype in new ctype
       (These characters have all become "Alphabetic" in Unicode 12.0.0.
       Therefore, they are not in "punct" anymore (see: is_punct() in unicode_utils.py))
punct: Added 296 characters in new ctype which were not in old ctype
tolower: Added 7 characters in new ctype which were not in old ctype
totitle: Added 7 characters in new ctype which were not in old ctype
toupper: Added 7 characters in new ctype which were not in old ctype
upper: Added 7 characters in new ctype which were not in old ctype
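
The "Missing" and "Added" figures above amount to set differences between the
old and new character classes. The following sketch shows that idea only; the
function name and sample code points are illustrative and are not taken from
the generator scripts.

```python
def compare_ctype(old, new):
    """Return (missing, added) code points between two character classes."""
    old, new = set(old), set(new)
    missing = sorted(old - new)  # in the old class but not in the new one
    added = sorted(new - old)    # in the new class but not in the old one
    return missing, added


missing, added = compare_ctype({0x41, 0x1CF2}, {0x41, 0x32FF})
print(len(missing), len(added))
```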

	[BZ #24307]
	* localedata/unicode-gen/Makefile (UNICODE_VERSION): Set to 12.0.0.
	* localedata/unicode-gen/DerivedCoreProperties.txt: Update to Unicode 12.0.0.
	* localedata/unicode-gen/EastAsianWidth.txt: Likewise.
	* localedata/unicode-gen/PropList.txt: Likewise.
	* localedata/unicode-gen/UnicodeData.txt: Likewise.
	* localedata/unicode-gen/ctype_compatibility_test_cases.py: U+108D became
        "Alphabetic" in Unicode 12.0.0. Adapt test case.
	* localedata/charmaps/UTF-8: Regenerate.
	* localedata/locales/i18n_ctype: Likewise.
	* localedata/locales/tr_TR: Likewise.
	* localedata/locales/translit_circle: Likewise.
	* localedata/locales/translit_cjk_compat: Likewise.
	* localedata/locales/translit_combining: Likewise.
	* localedata/locales/translit_compat: Likewise.
	* localedata/locales/translit_font: Likewise.
	* localedata/locales/translit_fraction: Likewise.
2019-03-08 12:20:35 +01:00
TAMUKI Shoichi
31effacee2 ja_JP: Change the offset for Taisho gan-nen from 2 to 1 [BZ #24162]
The offset in era-string format for Taisho gan-nen (1912) is currently
defined as 2, but it should be 1.  So fix it.  "Gan-nen" means the 1st
(origin) year; Taisho started on July 30, 1912.

Reported-by: Morimitsu, Junji <junji.morimitsu@hpe.com>
Reviewed-by: Rafal Luzynski <digitalfreak@lingonborough.com>

ChangeLog:

	[BZ #24162]
	* localedata/locales/ja_JP (LC_TIME): Change the offset for Taisho
	gan-nen from 2 to 1.  Problem reported by Morimitsu, Junji.
2019-03-02 21:00:28 +09:00
Joseph Myers
34a5a1460e Break some lines before not after operators.
The GNU Coding Standards specify that line breaks in expressions
should go before an operator, not after one.  This patch fixes various
code to do this.  It only changes code that appears to be mostly
following GNU style anyway, not files and directories with
substantially different formatting.  It is not exhaustive even for
files using GNU style (for example, changes to sysdeps files are
deferred for subsequent cleanups).  Some files changed are shared with
gnulib, but most are specific to glibc.  Changes were made manually,
with places to change found by grep (so some cases, e.g. where the
operator was followed by a comment at end of line, are particularly
liable to have been missed by grep, but I did include cases where the
operator was followed by backslash-newline).
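
The kind of search described above can be approximated as follows. The actual
grep pattern is not given in this message, so the regular expression here is
an assumption covering only a handful of operators. Like the search described
above, it misses an operator followed by a comment.

```python
import re

# Hypothetical heuristic: a few binary operators at end of line, optionally
# followed by a backslash-newline continuation.
TRAILING_OPERATOR = re.compile(r"(&&|\|\||==|!=|[+\-|&])\s*\\?$")


def ends_with_operator(line):
    """Return True if the line appears to break after a binary operator."""
    return TRAILING_OPERATOR.search(line.rstrip()) is not None


print(ends_with_operator("  if (a &&"))         # break after the operator
print(ends_with_operator("      && b)"))        # break before the operator
print(ends_with_operator("x = a + /* why */"))  # missed: trailing comment
```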

This patch generally does not attempt to address other coding style
issues in the expressions changed (for example, missing spaces before
'(', or lack of parentheses to ensure indentation of continuation
lines properly reflects operator precedence).

Tested for x86_64, and with build-many-glibcs.py.

	* benchtests/bench-memmem.c (simple_memmem): Break lines before
	rather than after operators.
	* benchtests/bench-skeleton.c (TIMESPEC_AFTER): Likewise.
	* crypt/md5.c (md5_finish_ctx): Likewise.
	* crypt/sha256.c (__sha256_finish_ctx): Likewise.
	* crypt/sha512.c (__sha512_finish_ctx): Likewise.
	* elf/cache.c (load_aux_cache): Likewise.
	* elf/dl-load.c (open_verify): Likewise.
	* elf/get-dynamic-info.h (elf_get_dynamic_info): Likewise.
	* elf/readelflib.c (process_elf_file): Likewise.
	* elf/rtld.c (dl_main): Likewise.
	* elf/sprof.c (generate_call_graph): Likewise.
	* hurd/ctty-input.c (_hurd_ctty_input): Likewise.
	* hurd/ctty-output.c (_hurd_ctty_output): Likewise.
	* hurd/dtable.c (reauth_dtable): Likewise.
	* hurd/getdport.c (__getdport): Likewise.
	* hurd/hurd/signal.h (_hurd_interrupted_rpc_timeout): Likewise.
	* hurd/hurd/sigpreempt.h (HURD_PREEMPT_SIGNAL_P): Likewise.
	* hurd/hurdfault.c (_hurdsig_fault_catch_exception_raise):
	Likewise.
	* hurd/hurdioctl.c (fioctl): Likewise.
	* hurd/hurdselect.c (_hurd_select): Likewise.
	* hurd/hurdsig.c (_hurdsig_abort_rpcs): Likewise.
	(STOPSIGS): Likewise.
	* hurd/hurdstartup.c (_hurd_startup): Likewise.
	* hurd/intr-msg.c (_hurd_intr_rpc_mach_msg): Likewise.
	* hurd/lookup-retry.c (__hurd_file_name_lookup_retry): Likewise.
	* hurd/msgportdemux.c (msgport_server): Likewise.
	* hurd/setauth.c (_hurd_setauth): Likewise.
	* include/features.h (__GLIBC_USE_DEPRECATED_SCANF): Likewise.
	* libio/libioP.h [IO_DEBUG] (CHECK_FILE): Likewise.
	* locale/programs/ld-ctype.c (set_class_defaults): Likewise.
	* localedata/tests-mbwc/tst_swscanf.c (tst_swscanf): Likewise.
	* login/tst-utmp.c (do_check): Likewise.
	(simulate_login): Likewise.
	* mach/lowlevellock.h (lll_lock): Likewise.
	(lll_trylock): Likewise.
	* math/test-fenv.c (ALL_EXC): Likewise.
	* math/test-fenvinline.c (ALL_EXC): Likewise.
	* misc/sys/cdefs.h (__attribute_deprecated_msg__): Likewise.
	* nis/nis_call.c (__do_niscall3): Likewise.
	* nis/nis_callback.c (cb_prog_1): Likewise.
	* nis/nis_defaults.c (searchaccess): Likewise.
	* nis/nis_findserv.c (__nis_findfastest_with_timeout): Likewise.
	* nis/nis_ismember.c (internal_ismember): Likewise.
	* nis/nis_local_names.c (nis_local_principal): Likewise.
	* nis/nss_nis/nis-rpc.c (_nss_nis_getrpcbyname_r): Likewise.
	* nis/nss_nisplus/nisplus-netgrp.c (_nss_nisplus_getnetgrent_r):
	Likewise.
	* nis/ypclnt.c (yp_match): Likewise.
	(yp_first): Likewise.
	(yp_next): Likewise.
	(yp_master): Likewise.
	(yp_order): Likewise.
	* nscd/hstcache.c (cache_addhst): Likewise.
	* nscd/initgrcache.c (addinitgroupsX): Likewise.
	* nss/nss_compat/compat-pwd.c (copy_pwd_changes): Likewise.
	(internal_getpwuid_r): Likewise.
	* nss/nss_compat/compat-spwd.c (copy_spwd_changes): Likewise.
	* posix/glob.h (__GLOB_FLAGS): Likewise.
	* posix/regcomp.c (peek_token): Likewise.
	(peek_token_bracket): Likewise.
	(parse_expression): Likewise.
	* posix/regexec.c (sift_states_iter_mb): Likewise.
	(check_node_accept_bytes): Likewise.
	* posix/tst-spawn3.c (do_test): Likewise.
	* posix/wordexp-test.c (testit): Likewise.
	* posix/wordexp.c (parse_tilde): Likewise.
	(exec_comm): Likewise.
	* posix/wordexp.h (__WRDE_FLAGS): Likewise.
	* resource/vtimes.c (TIMEVAL_TO_VTIMES): Likewise.
	* setjmp/sigjmp.c (__sigjmp_save): Likewise.
	* stdio-common/printf_fp.c (__printf_fp_l): Likewise.
	* stdio-common/tst-fileno.c (do_test): Likewise.
	* stdio-common/vfprintf-internal.c (vfprintf): Likewise.
	* stdlib/strfmon_l.c (__vstrfmon_l_internal): Likewise.
	* stdlib/strtod_l.c (round_and_return): Likewise.
	(____STRTOF_INTERNAL): Likewise.
	* stdlib/tst-strfrom.h (TEST_STRFROM): Likewise.
	* string/strcspn.c (STRCSPN): Likewise.
	* string/test-memmem.c (simple_memmem): Likewise.
	* termios/tcsetattr.c (tcsetattr): Likewise.
	* time/alt_digit.c (_nl_parse_alt_digit): Likewise.
	* time/asctime.c (asctime_internal): Likewise.
	* time/strptime_l.c (__strptime_internal): Likewise.
	* time/sys/time.h (timercmp): Likewise.
	* time/tzfile.c (__tzfile_compute): Likewise.
2019-02-22 01:32:36 +00:00
Aurelien Jarno
7395f3a0ef en_US: define date_fmt (bug 24046)
The en_US locale uses a 12h am/pm format in both d_fmt and d_t_fmt, which
is correct, but does not define date_fmt. This causes the default value
to be used, which is in 24h format.

This patch adds the date_fmt entry to the en_US locale with the same
value as d_t_fmt as the latter already includes the timezone.
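
A small sketch of what the new format produces. It expands "%r" by hand to
its POSIX-locale meaning ("%I:%M:%S %p") so it does not rely on platform
support for that conversion, and it runs in the process's default time
locale rather than loading en_US. The sample timestamp is arbitrary.

```python
from datetime import datetime, timezone

EN_US_DATE_FMT = "%a %d %b %Y %r %Z"  # value added by this commit


def render(fmt, when):
    """Format `when`, expanding %r manually for portability."""
    return when.strftime(fmt.replace("%r", "%I:%M:%S %p"))


when = datetime(2019, 1, 7, 14, 51, 13, tzinfo=timezone.utc)
print(render(EN_US_DATE_FMT, when))
```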

Changelog
	[BZ #24046]
	* localedata/locales/en_US (date_fmt): Add, set to
	"%a %d %b %Y %r %Z".
2019-01-07 14:51:13 +01:00
PanderMusubi
4d7d7dc6fe bs_BA: Fix a small typo in comment (bug 24011).
[BZ #24011]
	* localedata/locales/bs_BA (LC_TELEPHONE): Fix a typo in comment.
2019-01-02 23:50:49 +01:00
Joseph Myers
04277e02d7 Update copyright dates with scripts/update-copyrights.
* All files with FSF copyright notices: Update copyright dates
	using scripts/update-copyrights.
	* locale/programs/charmap-kw.h: Regenerated.
	* locale/programs/locfile-kw.h: Likewise.
2019-01-01 00:11:28 +00:00
Rafal Luzynski
989182c40a Multiple locales: Use the correct 12-hour time formats (bug 10496).
It has been discovered that some locales use the 12-hour time formats but
do not use any AM/PM indicator thus making the time ambiguous.  This
commit adds "%p" wherever it was missing.  In some cases it has been
identified that a locale should use 24-hour time format rather than
12-hour.  All time formats come from CLDR but this commit introduces as
few changes as possible (for example, it tries not to change the time zone
display).  For the locales which are not supported by CLDR the consistency
with similar locales (which means the same language or the same country)
has been preserved: if the time formats were the same before the change
then they are still the same after the change.

The time format updates can be roughly summarized as follows:

* Most of the locales of Djibouti, Eritrea, and Ethiopia now use
"%l:%M:%S %p".
* Most of the locales of India and some surrounding countries (Bangladesh,
Nepal etc.) now use "%I:%M:%S %p %Z".
* Most of the Arabic locales now use "%Z %I:%M:%S %p".
* Ge'ez language (Eritrea and Ethiopia) now uses "%l:%M:%S፡%p" (note the
consistent use of Ethiopic wordspace character).
* Tamil (India) now uses "%p %I:%M:%S %Z".
* Chinese (Hong Kong) t_fmt now uses "%p %I<U6642>%M<U5206>%S<U79D2> %Z".
* Additionally, the following locales have been switched from 12-hour time
formats to 24-hour, according to CLDR: Arabic (Morocco), Maltese, Somali
(Kenya), and Tamil (Sri Lanka).
* Finally, the Bulgarian, Czech, and Slovak locales correctly used the
24-hour time format, but their t_fmt_ampm field incorrectly contained a
12-hour time format; it is now replaced with an empty string.
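
The ambiguity that motivates this change is easy to reproduce. The sketch
uses "%I" (zero-padded) instead of "%l", because "%l" is an extension that
not every platform supports; the sample times are arbitrary.

```python
from datetime import time

morning = time(2, 5, 9)
afternoon = time(14, 5, 9)

# Without an AM/PM indicator both times render identically.
ambiguous = (morning.strftime("%I:%M:%S"), afternoon.strftime("%I:%M:%S"))

# Adding %p makes the two renderings distinct.
distinct = (morning.strftime("%I:%M:%S %p"), afternoon.strftime("%I:%M:%S %p"))

print(ambiguous)
print(distinct)
```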

	[BZ #10496]
	* localedata/locales/aa_DJ (t_fmt): Set to "%l:%M:%S %p".
	(t_fmt_ampm): Likewise.
	* localedata/locales/aa_ER (t_fmt): Likewise.
	(t_fmt_ampm): Likewise.
	* localedata/locales/aa_ER@saaho (t_fmt): Likewise.
	(t_fmt_ampm): Likewise.
	* localedata/locales/aa_ET (t_fmt): Likewise.
	(t_fmt_ampm): Likewise.
	* localedata/locales/am_ET (t_fmt): Likewise.
	(t_fmt_ampm): Likewise.
	* localedata/locales/byn_ER (t_fmt): Likewise.
	(t_fmt_ampm): Likewise.
	* localedata/locales/om_ET (t_fmt): Likewise.
	(t_fmt_ampm): Likewise.
	* localedata/locales/sid_ET (t_fmt): Likewise.
	(t_fmt_ampm): Likewise.
	* localedata/locales/so_DJ (t_fmt): Likewise.
	(t_fmt_ampm): Likewise.
	* localedata/locales/so_ET (t_fmt): Likewise.
	(t_fmt_ampm): Likewise.
	* localedata/locales/so_SO (t_fmt): Likewise.
	(t_fmt_ampm): Likewise.
	* localedata/locales/ti_ER (t_fmt): Likewise.
	(t_fmt_ampm): Likewise.
	* localedata/locales/ti_ET (t_fmt): Likewise.
	(t_fmt_ampm): Likewise.
	* localedata/locales/tig_ER (t_fmt): Likewise.
	(t_fmt_ampm): Likewise.
	* localedata/locales/wal_ET (t_fmt): Likewise.
	(t_fmt_ampm): Likewise.

	* localedata/locales/anp_IN (t_fmt): Set to "%I:%M:%S %p %Z".
	* localedata/locales/ar_IN (t_fmt): Likewise.
	* localedata/locales/bhb_IN (t_fmt): Likewise.
	* localedata/locales/bho_IN (t_fmt): Likewise.
	* localedata/locales/bi_VU (t_fmt): Likewise.
	* localedata/locales/bn_BD (t_fmt): Likewise.
	* localedata/locales/bn_IN (t_fmt): Likewise.
	* localedata/locales/brx_IN (t_fmt): Likewise.
	* localedata/locales/doi_IN (t_fmt): Likewise.
	* localedata/locales/en_HK (t_fmt): Likewise.
	(t_fmt_ampm): Likewise.
	* localedata/locales/en_IN (t_fmt): Likewise.
	* localedata/locales/en_PH (t_fmt): Likewise.
	* localedata/locales/gu_IN (t_fmt): Likewise.
	* localedata/locales/hi_IN (t_fmt): Likewise.
	* localedata/locales/hif_FJ (t_fmt): Likewise.
	* localedata/locales/hne_IN (t_fmt): Likewise.
	* localedata/locales/kn_IN (t_fmt): Likewise.
	* localedata/locales/kok_IN (t_fmt): Likewise.
	* localedata/locales/ks_IN (t_fmt): Likewise.
	* localedata/locales/ks_IN@devanagari (t_fmt): Likewise.
	* localedata/locales/mag_IN (t_fmt): Likewise.
	* localedata/locales/mai_IN (t_fmt): Likewise.
	* localedata/locales/mjw_IN (t_fmt): Likewise.
	* localedata/locales/ml_IN (t_fmt): Likewise.
	* localedata/locales/mni_IN (t_fmt): Likewise.
	* localedata/locales/mr_IN (t_fmt): Likewise.
	* localedata/locales/ms_MY (t_fmt): Likewise.
	* localedata/locales/pa_IN (t_fmt): Likewise.
	* localedata/locales/raj_IN (t_fmt): Likewise.
	* localedata/locales/sa_IN (t_fmt): Likewise.
	* localedata/locales/sat_IN (t_fmt): Likewise.
	* localedata/locales/sd_IN (t_fmt): Likewise.
	* localedata/locales/sd_IN@devanagari (t_fmt): Likewise.
	* localedata/locales/tcy_IN (t_fmt): Likewise.
	* localedata/locales/the_NP (t_fmt): Likewise.
	* localedata/locales/to_TO (t_fmt): Likewise.
	* localedata/locales/ur_IN (t_fmt): Likewise.

	* localedata/locales/hif_FJ (d_t_fmt): Set to
	"%A %d %b %Y %I:%M:%S %p".
	(date_fmt): Add, set to "%A %d %b %Y %I:%M:%S %p %Z".

	* localedata/locales/ar_AE (t_fmt): Set to "%Z %I:%M:%S %p".
	* localedata/locales/ar_BH (t_fmt): Likewise.
	* localedata/locales/ar_DZ (t_fmt): Likewise.
	* localedata/locales/ar_EG (t_fmt): Likewise.
	* localedata/locales/ar_IQ (t_fmt): Likewise.
	* localedata/locales/ar_JO (t_fmt): Likewise.
	* localedata/locales/ar_KW (t_fmt): Likewise.
	* localedata/locales/ar_LB (t_fmt): Likewise.
	* localedata/locales/ar_LY (t_fmt): Likewise.
	* localedata/locales/ar_OM (t_fmt): Likewise.
	* localedata/locales/ar_QA (t_fmt): Likewise.
	* localedata/locales/ar_SD (t_fmt): Likewise.
	* localedata/locales/ar_SS (t_fmt): Likewise.
	* localedata/locales/ar_SY (t_fmt): Likewise.
	* localedata/locales/ar_TN (t_fmt): Likewise.
	* localedata/locales/ar_YE (t_fmt): Likewise.

	* localedata/locales/gez_ER (t_fmt): Set to "%l:%M:%S<U1361>%p".
	(t_fmt_ampm): Likewise.
	* localedata/locales/gez_ET (t_fmt): Likewise.
	(t_fmt_ampm): Likewise.

	* localedata/locales/ta_IN (t_fmt): Set to "%p %I:%M:%S %Z".
	(t_fmt_ampm): Likewise.
	(d_t_fmt): Set to "%A %d %B %Y %p %I:%M:%S %Z".

	* localedata/locales/zh_HK (t_fmt):
	Set to "%p %I<U6642>%M<U5206>%S<U79D2> %Z".

	* localedata/locales/ar_MA (t_fmt_ampm): Set to "" (empty string)
	because this locale does not use the 12-hour clock.
	(t_fmt): Set to "%Z %H:%M:%S".
	(d_t_fmt): Set to "%d %b, %Y %Z %H:%M:%S".

	* localedata/locales/mt_MT (t_fmt_ampm): Set to "" (empty string)
	because this locale does not use the 12-hour clock.
	(t_fmt): Set to "%H:%M:%S %Z".
	(d_t_fmt): Set to "%A, %d ta %b, %Y %H:%M:%S %Z".

	* localedata/locales/so_KE (t_fmt_ampm): Set to "" (empty string)
	because this locale does not use the 12-hour clock.
	(t_fmt): Set to "%T".
	(d_t_fmt): Set to "%A, %B %e, %Y %X %Z".
	(date_fmt): Set to "%A, %B %e, %X %Z %Y".

	* localedata/locales/ta_LK (t_fmt_ampm): Set to "" (empty string)
	because this locale does not use the 12-hour clock.
	(t_fmt): Set to "%H:%M:%S %Z".
	(d_t_fmt): Set to "%A %d %B %Y %H:%M:%S %Z".

	* localedata/locales/bg_BG (t_fmt_ampm): Set to "" (empty string)
	because this locale does not use the 12-hour clock.
	* localedata/locales/cs_CZ (t_fmt_ampm): Likewise.
	* localedata/locales/sk_SK (t_fmt_ampm): Likewise.
2018-12-28 21:56:18 +01:00
Rafal Luzynski
27841a7d5a sq_AL: Use the correct date and time formats (bug 10496, 23724).
Albanian locale uses the 12-hour clock but some time formats did not
use any AM/PM indicator making the time ambiguous.  This commit adds
"%p" wherever it was missing.

It also sets the correct date format because the old "%Y-%b-%d" produced
rather weird results like "2018-Sht-28".

All time formats come from CLDR but as few changes have been introduced
by this commit as possible.  Some articles from MSDN and other available
online sources have also been taken into account.

	[BZ #10496]
	[BZ #23724]
	* localedata/locales/sq_AL (t_fmt): Set to "%I:%M:%S.%p %Z".
	(t_fmt_ampm): Likewise.
	(d_t_fmt): Set to "%a %-d %b %Y %I:%M:%S.%p".
	(date_fmt): Add, set to "%a %-d %b %Y %I:%M:%S.%p %Z".
	(d_fmt): Set to "%-d.%-m.%y".
2018-12-28 21:45:27 +01:00
Florian Weimer
40e6c1ec1f localedata: Remove executable bit from localedata/locales/bi_VU [BZ #23995]
2018-12-18 10:56:21 +01:00
Carlos O'Donell
8cebd4ffe6 Add --no-hard-links option to localedef (bug 23923)
Downstream distributions need consistent sets of hardlinks in
order for rpm to operate effectively. This means that even if
locales are built with a high level of parallelism, the
resulting files need to have consistent hardlink counts. The only
way to achieve this is with a post-install hardlink pass using a
program like 'hardlink' (shipped in Fedora).

If the downstream distro wants to post-process the hardlinks then
the time spent in localedef looking up sibling directories and
processing hardlinks is wasted effort.

To optimize the build and install pass we add a --no-hard-links
option to localedef to avoid doing the hardlink optimization for
size.

Tested on x86_64 with 'make localedata/install-locale-files'
before and after. Without the patch we have files with 100+
hardlink counts. After the patch and running with --no-hard-links
all link counts are 1. This patch also alters the convenience
target 'make localedata/install-locale-files' to use the new
option.
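
The link counts mentioned above are the st_nlink values reported by stat. The
sketch below shows how such a count changes when a hard link is created. It
works in a temporary directory with placeholder file names and does not
inspect an installed locale tree.

```python
import os
import tempfile


def link_counts_demo():
    """Return the link count of a file before and after hard-linking it."""
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "LC_CTYPE")        # placeholder name
        alias = os.path.join(tmp, "LC_CTYPE.alias")     # placeholder name
        with open(original, "w") as handle:
            handle.write("placeholder locale data\n")
        before = os.stat(original).st_nlink
        os.link(original, alias)                        # create a hard link
        after = os.stat(original).st_nlink
        return before, after


print(link_counts_demo())
```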

Signed-off-by: Carlos O'Donell <carlos@redhat.com>
2018-12-03 10:15:39 -05:00
Sergi Almacellas Abellana
fec8bb7ca9 Currency symbol should not precede amount for [BZ #23791]
CLDR also has the currency symbol after the amount for Catalan.

Also set grouping in LC_NUMERIC to 3;3.

Reviewed-by: Mike FABIAN <mfabian@redhat.com>
2018-10-29 19:23:11 +01:00
Andreas Schwab
ce5a7de6cd Don't reduce test timeout to less than default
This removes all overrides of TIMEOUT that are less than or equal to the
default timeout.
2018-10-17 09:34:13 +02:00
Rafal Luzynski
a68ec8eac2 kl_GL: Update the month names and date formats (bug 23740).
Month names as provided by Oqaasileriffik, the official Greenlandic
language regulator.  They have recently reached the consensus regarding
the orthography of the month names.

Date formats updated to match the correct Greenlandic order which is MDY.

	[BZ #23740]
	* localedata/locales/kl_GL (mon): Update, the relative case.
	(alt_mon): Add, fill with month names in the nominative case.
	(d_t_fmt): Set to "%a %b %d %Y %T %Z".
	(d_fmt): Set to "%b %d %Y".
2018-10-08 12:28:02 +02:00
Rafal Luzynski
dae3ed958c kl_GL: Fix spelling of Sunday, should be "sapaat" (bug 20209).
Although CLDR says otherwise, it is confirmed by Oqaasileriffik, the
official Greenlandic language regulator, that this change is correct.

	[BZ #20209]
	* localedata/locales/kl_GL (abday): Fix spelling of Sun (Sunday),
	should be "sap" rather than "sab".
	(day): Fix spelling of Sunday, should be "sapaat" rather than
	"sabaat".
2018-10-02 23:55:13 +02:00
Rafal Luzynski
434d45fd70 it_CH/it_IT locales: Correct some LC_TIME formats (bug 10425).
Synchronize some values with CLDR and apply some suggestions from Bugzilla.

	[BZ #10425]
	* localedata/locales/it_IT (d_t_fmt): Use "%a %-d %b %Y, %T".
	(date_fmt): Use "%a %-d %b %Y, %T, %Z".
	* localedata/locales/it_CH (d_t_fmt): Use "%a %-d %b %Y, %T"
	which is the same as in it_IT.
	(d_fmt): Use "%d.%m.%Y" which is the same as in de_CH.
	(date_fmt): Use "%a %-d %b %Y, %T, %Z" which is the same as in it_IT.
2018-09-21 10:40:20 +02:00
Rafal Luzynski
527f355e5e Italian and Swiss locales: Use the correct separators (bug 10797).
CLDR and many other sources say that it_IT (Italian) should use a dot
(".") as a thousands separator and a comma (",") as a decimal separator.

For it_CH and de_CH CLDR says that they should use the Right Single
Quotation Mark ("’") as a thousands separator and a dot (".") as a
decimal separator.  Consequently, the same rules are copied to all other
locales in Switzerland.

These rules apply to both LC_MONETARY and LC_NUMERIC.

	[BZ #10797]
	* localedata/locales/de_CH (mon_thousands_sep): Use "<U2019>" (Right
	Single Quotation Mark).
	(thousands_sep): Likewise.
	* localedata/locales/it_CH (LC_NUMERIC): Use “copy "de_CH"”.
	* localedata/locales/it_IT (thousands_sep): Use ".".
	(grouping): Use "3;3".
2018-09-10 23:56:53 +02:00
Rafal Luzynski
a33650d1a6 Indian and similar locales: Set the correct date format (bug 17426).
This commit also fixes d_fmt in bn_BD which is identical to bn_IN,
in ne_NP which is identical to ne_IN (not supported by Glibc but supported
by CLDR), and in ta_LK which is identical to ta_IN.

For those locales which are supported by CLDR data is imported from
CLDR v33.  For others it is copied from those locales which were identical
before this commit.

	[BZ #17426]
	* localedata/locales/anp_IN (d_fmt): Use "%-d//%-m//%y".
	* localedata/locales/ar_IN (d_fmt): Likewise.
	* localedata/locales/bhb_IN (d_fmt): Likewise.
	* localedata/locales/bho_IN (d_fmt): Likewise.
	* localedata/locales/bn_BD (d_fmt): Likewise.
	* localedata/locales/bn_IN (d_fmt): Likewise.
	* localedata/locales/doi_IN (d_fmt): Likewise.
	* localedata/locales/gu_IN (d_fmt): Likewise.
	* localedata/locales/hi_IN (d_fmt): Likewise.
	* localedata/locales/hne_IN (d_fmt): Likewise.
	* localedata/locales/kn_IN (d_fmt): Likewise.
	* localedata/locales/mag_IN (d_fmt): Likewise.
	* localedata/locales/mai_IN (d_fmt): Likewise.
	* localedata/locales/mjw_IN (d_fmt): Likewise.
	* localedata/locales/ml_IN (d_fmt): Likewise.
	* localedata/locales/mni_IN (d_fmt): Likewise.
	* localedata/locales/mr_IN (d_fmt): Likewise.
	* localedata/locales/pa_IN (d_fmt): Likewise.
	* localedata/locales/raj_IN (d_fmt): Likewise.
	* localedata/locales/sat_IN (d_fmt): Likewise.
	* localedata/locales/sd_IN (d_fmt): Likewise.
	* localedata/locales/sd_IN@devanagari (d_fmt): Likewise.
	* localedata/locales/ta_IN (d_fmt): Likewise.
	* localedata/locales/ta_LK (d_fmt): Likewise.
	* localedata/locales/tcy_IN (d_fmt): Likewise.
	* localedata/locales/ur_IN (d_fmt): Likewise.

	* localedata/locales/brx_IN (d_fmt): Use "%-m//%-d//%y".
	* localedata/locales/ks_IN (d_fmt): Likewise.
	* localedata/locales/ks_IN@devanagari (d_fmt): Likewise.

	* localedata/locales/kok_IN (d_fmt): Use "%-d-%-m-%y".
	* localedata/locales/ne_NP (d_fmt): Use "%y//%-m//%-d".
	* localedata/locales/sa_IN (d_fmt): Use "%-d-%m-%y".
	* localedata/locales/te_IN (d_fmt): Use "%d-%m-%y".
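
The doubled slashes above are assumed here to be the locale source spelling
of a literal "/", on the assumption that these files declare "/" as their
escape character. The "%-d" and "%-m" forms drop the zero padding. The sketch
below undoes the escaping and renders the resulting "%-d/%-m/%y" by hand,
since the "-" flag is an extension that strftime does not support everywhere.
Both helpers are hypothetical.

```python
from datetime import date


def unescape(fmt, escape_char="/"):
    """Collapse a doubled escape character into a single literal one."""
    return fmt.replace(escape_char * 2, escape_char)


def render_short_date(day):
    """Hand-written equivalent of "%-d/%-m/%y" (no padding on day, month)."""
    return f"{day.day}/{day.month}/{day.year % 100:02d}"


print(unescape("%-d//%-m//%y"))
print(render_short_date(date(2018, 9, 5)))
```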
2018-09-05 23:57:11 +02:00
Rafal Luzynski
5abedf97a3 en_IN: Set the correct date format for "%x" (bug 17426).
[BZ #17426]
	* localedata/locales/en_IN (d_fmt): Use "%d/%m/%y".
2018-08-28 00:21:52 +02:00
Carlos O'Donell
08a5ee14c6 Add convenience target 'install-locale-files'.
The convenience install target 'install-locale-files' is created
to allow distributions to install all of the SUPPORTED locales as
files instead of into the locale-archive.

You invoke the new convenience target like this:
make localedata/install-locale-files DESTDIR=<prefix>
2018-08-02 15:31:12 -04:00
Carlos O'Donell
49dddc3e99 Add missing localedata/en_US.UTF-8.in (Bug 23393).
Commit 7cd7d36f1f failed to include
the new testing file en_US.UTF-8.in.
2018-07-25 21:58:10 -04:00
Carlos O'Donell
7cd7d36f1f Keep expected behaviour for [a-z] and [A-Z] (Bug 23393).
In commit 9479b6d5e0 we updated all of
the collation data to harmonize with the new version of ISO 14651
which is derived from Unicode 9.0.0.  This collation update brought
with it some changes to locales which some users found undesirable.
In particular, it altered the meaning of the locale-dependent range
expressions [a-z] and [A-Z], and for en_US it caused uppercase
letters to be matched by [a-z] for the first time.  The matching of
uppercase letters by [a-z] is already familiar to users of other
locales which have this property, but this change could cause
significant problems for en_US and other similar locales that never
had this behaviour before.  Whether this behaviour is desirable or
not is contentious, and GNU Awk has this to say on the topic:
https://www.gnu.org/software/gawk/manual/html_node/Ranges-and-Locales.html
The POSIX standard rationale adds, under "RE Bracket Expression":
http://pubs.opengroup.org/onlinepubs/9699919799/xrat/V4_xbd_chap09.html
"The current standard leaves unspecified the behavior of a range
expression outside the POSIX locale. ... As noted above, efforts were
made to resolve the differences, but no solution has been found that
would be specific enough to allow for portable software while not
invalidating existing implementations."
In glibc we implement the requirement of ISO POSIX-2:1993 and use
collation element order (CEO) to construct the range expression; the
internal API is __collseq_table_lookup().  The fact that we use CEO
and also have 4-level weights on each collation rule means that we can
in practice reorder the collation rules in iso14651_t1_common (the new
data) to provide consistent range expression resolution *and* the
weights should maintain the expected total order.  Therefore this
patch does three things:

* Reorder the collation rules for the LATIN script in
  iso14651_t1_common to deinterlace uppercase and lowercase letters in
  the collation element orders.

* Add new test data en_US.UTF-8.in for sort-test.sh which exercises
  strcoll* and strxfrm* and ensures the ISO 14651 collation remains.

* Add back tests to tst-fnmatch.input and tst-regexloc.c which
  exercise that [a-z] does not match A or Z.

The reordering of the ISO 14651 data is done in an entirely mechanical
fashion using the following program attached to the bug:
https://sourceware.org/bugzilla/show_bug.cgi?id=23393#c28

It is up for discussion if the iso14651_t1_common data should be
refined further to have 3 very tight collation element ranges that
include only a-z, A-Z, and 0-9, which would implement the solution
sought after in:
https://sourceware.org/bugzilla/show_bug.cgi?id=23393#c12
and implemented here:
https://www.sourceware.org/ml/libc-alpha/2018-07/msg00854.html

No regressions on x86_64.
Verified that removal of the iso14651_t1_common change causes tst-fnmatch
to regress with:
422: fnmatch ("[a-z]", "A", 0) = 0 (FAIL, expected FNM_NOMATCH) ***
...
425: fnmatch ("[A-Z]", "z", 0) = 0 (FAIL, expected FNM_NOMATCH) ***
2018-07-25 17:00:45 -04:00
Quentin PAGÈS
df467d229a oc_FR locale: Multiple updates (bug 23140, bug 23422).
Multiple updates for the Occitan language: add alternative month names,
update abday and abmon, fix typos in day, fix d_fmt, correct LC_NAME,
and use “copy "ca_ES"” as LC_COLLATE.

	[BZ #23140]
	* localedata/locales/oc_FR (mon): Rename to...
	(alt_mon): This, then update October (typo fix).
	(mon): New content (genitive case, month names preceded by
	"de" or "d’").

	[BZ #23422]
	* localedata/locales/oc_FR (abday): Update all items.
	(day): Update Wednesday and Saturday (typo fixes).
	(abmon): Update all items, except May.
	(d_fmt): Update "%d.%m.%Y" -> "%d/%m/%Y".
	(LC_IDENTIFICATION): Bump the revision number and date.
	Keep the "category" entries in alphabetic order.
	(LC_ADDRESS): Remove no longer needed comment.
	(LC_COLLATE): Use “copy "ca_ES"”.
	(LC_NAME): Set the correct values of "name_fmt", "name_mr", and
	"name_mrs".

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2018-07-18 23:17:17 +02:00
Valery Timiriliyev
61c4aad705 New locale: Yakut (Sakha) for Russia (sah_RU) [BZ #22241]
	* localedata/Makefile (test-input): Add sah_RU.UTF-8.
	(LOCALES): Likewise.
	* localedata/SUPPORTED (sah_RU/UTF-8): New entry.
	* localedata/locales/sah_RU: New file.
	* localedata/sah_RU.UTF-8.in: New file.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2018-07-18 11:45:44 +02:00
Rafal Luzynski
9145f0333d os_RU: Add alternative month names (bug 23140).
	[BZ #23140]
	* localedata/locales/os_RU (mon): Rename to...
	(alt_mon): This.
	(mon): Import from CLDR (genitive case).

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2018-07-17 23:58:56 +02:00
Rafal Luzynski
0a83bad2aa dsb_DE locale: Fix syntax error and add tests (bug 23208).
Fixed a syntax error in the collation rules of the Lower Sorbian language.
A collation test was added in order to catch bugs like this early.

Reported-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>

	[BZ #23208]
	* localedata/Makefile (test-input): Add dsb_DE.UTF-8.
	(LOCALES): Likewise.
	* localedata/dsb_DE.UTF-8.in: New file.
	* localedata/locales/dsb_DE (LC_COLLATE): Fix syntax error.
2018-07-13 23:06:32 +02:00
Mike FABIAN
4beefeeb8e Put the correct Unicode version number 11.0.0 into the generated files
In some places there was still the old Unicode version 10.0.0 in the files.

	* localedata/charmaps/UTF-8: Use correct Unicode version 11.0.0 in comment.
	* localedata/locales/i18n_ctype: Use correct Unicode version in comments
	and headers.
	* localedata/unicode-gen/utf8_gen.py: Add option to specify Unicode version.
	* localedata/unicode-gen/Makefile: Use option to specify Unicode version
	for utf8_gen.py.
2018-07-10 17:30:31 +02:00