Mike FABIAN
1597385481
Adapt collation in several locales to the new iso14651_t1_common file
...
[BZ #22550] - es_ES locale (and other es_* locales): collation should
treat ñ as a primary different character, sync the collation
for Spanish with CLDR
[BZ #21547] - Tibetan script collation broken (Dzongkha and Tibetan)
* localedata/Makefile: Add new test files.
* localedata/lv_LV.UTF-8.in: Adapt test file to new collation order.
* localedata/sv_SE.ISO-8859-1.in: Adapt test file to new collation order.
* localedata/uk_UA.UTF-8.in: Adapt test file to new collation order.
* localedata/am_ET.UTF-8.in: New test file.
* localedata/az_AZ.UTF-8.in: Likewise.
* localedata/be_BY.UTF-8.in: Likewise.
* localedata/ber_DZ.UTF-8.in: Likewise.
* localedata/ber_MA.UTF-8.in: Likewise.
* localedata/bg_BG.UTF-8.in: Likewise.
* localedata/br_FR.UTF-8.in: Likewise.
* localedata/cmn_TW.UTF-8.in: Likewise.
* localedata/crh_UA.UTF-8.in: Likewise.
* localedata/csb_PL.UTF-8.in: Likewise.
* localedata/cv_RU.UTF-8.in: Likewise.
* localedata/cy_GB.UTF-8.in: Likewise.
* localedata/dz_BT.UTF-8.in: Likewise.
* localedata/eo.UTF-8.in: Likewise.
* localedata/es_ES.UTF-8.in: Likewise.
* localedata/fa_IR.UTF-8.in: Likewise.
* localedata/fi_FI.UTF-8.in: Likewise.
* localedata/fil_PH.UTF-8.in: Likewise.
* localedata/fur_IT.UTF-8.in: Likewise.
* localedata/gez_ER.UTF-8@abegede.in: Likewise.
* localedata/ha_NG.UTF-8.in: Likewise.
* localedata/ig_NG.UTF-8.in: Likewise.
* localedata/ik_CA.UTF-8.in: Likewise.
* localedata/kk_KZ.UTF-8.in: Likewise.
* localedata/ku_TR.UTF-8.in: Likewise.
* localedata/ky_KG.UTF-8.in: Likewise.
* localedata/ln_CD.UTF-8.in: Likewise.
* localedata/mi_NZ.UTF-8.in: Likewise.
* localedata/ml_IN.UTF-8.in: Likewise.
* localedata/mn_MN.UTF-8.in: Likewise.
* localedata/mr_IN.UTF-8.in: Likewise.
* localedata/mt_MT.UTF-8.in: Likewise.
* localedata/nb_NO.UTF-8.in: Likewise.
* localedata/om_KE.UTF-8.in: Likewise.
* localedata/os_RU.UTF-8.in: Likewise.
* localedata/ps_AF.UTF-8.in: Likewise.
* localedata/ro_RO.UTF-8.in: Likewise.
* localedata/ru_RU.UTF-8.in: Likewise.
* localedata/sc_IT.UTF-8.in: Likewise.
* localedata/se_NO.UTF-8.in: Likewise.
* localedata/sq_AL.UTF-8.in: Likewise.
* localedata/sv_SE.UTF-8.in: Likewise.
* localedata/szl_PL.UTF-8.in: Likewise.
* localedata/tg_TJ.UTF-8.in: Likewise.
* localedata/tk_TM.UTF-8.in: Likewise.
* localedata/tt_RU.UTF-8.in: Likewise.
* localedata/tt_RU.UTF-8@iqtelif.in: Likewise.
* localedata/ug_CN.UTF-8.in: Likewise.
* localedata/uz_UZ.UTF-8.in: Likewise.
* localedata/vi_VN.UTF-8.in: Likewise.
* localedata/yi_US.UTF-8.in: Likewise.
* localedata/yo_NG.UTF-8.in: Likewise.
* localedata/zh_CN.UTF-8.in: Likewise.
* localedata/locales/am_ET: Adapt collation rules to new iso14651_t1_common
file and fix bugs in the collation.
* localedata/locales/az_AZ: Likewise.
* localedata/locales/be_BY: Likewise.
* localedata/locales/ber_DZ: Likewise.
* localedata/locales/ber_MA: Likewise.
* localedata/locales/bg_BG: Likewise.
* localedata/locales/br_FR: Likewise.
* localedata/locales/br_FR@euro: Likewise.
* localedata/locales/ca_ES: Likewise.
* localedata/locales/cns11643_stroke: Likewise.
* localedata/locales/crh_UA: Likewise.
* localedata/locales/cs_CZ: Likewise.
* localedata/locales/csb_PL: Likewise.
* localedata/locales/cv_RU: Likewise.
* localedata/locales/cy_GB: Likewise.
* localedata/locales/da_DK: Likewise.
* localedata/locales/dz_BT: Likewise.
* localedata/locales/en_CA: Likewise.
* localedata/locales/eo: Likewise.
* localedata/locales/es_CU: Likewise.
* localedata/locales/es_EC: Likewise.
* localedata/locales/es_ES: Likewise.
* localedata/locales/es_US: Likewise.
* localedata/locales/et_EE: Likewise.
* localedata/locales/fa_IR: Likewise.
* localedata/locales/fi_FI: Likewise.
* localedata/locales/fil_PH: Likewise.
* localedata/locales/fur_IT: Likewise.
* localedata/locales/gez_ER@abegede: Likewise.
* localedata/locales/ha_NG: Likewise.
* localedata/locales/hr_HR: Likewise.
* localedata/locales/hsb_DE: Likewise.
* localedata/locales/hu_HU: Likewise.
* localedata/locales/ig_NG: Likewise.
* localedata/locales/ik_CA: Likewise.
* localedata/locales/is_IS: Likewise.
* localedata/locales/iso14651_t1_pinyin: Likewise.
* localedata/locales/kk_KZ: Likewise.
* localedata/locales/ku_TR: Likewise.
* localedata/locales/ky_KG: Likewise.
* localedata/locales/ln_CD: Likewise.
* localedata/locales/lt_LT: Likewise.
* localedata/locales/lv_LV: Likewise.
* localedata/locales/mi_NZ: Likewise.
* localedata/locales/ml_IN: Likewise.
* localedata/locales/mn_MN: Likewise.
* localedata/locales/mr_IN: Likewise.
* localedata/locales/mt_MT: Likewise.
* localedata/locales/nb_NO: Likewise.
* localedata/locales/om_KE: Likewise.
* localedata/locales/os_RU: Likewise.
* localedata/locales/pl_PL: Likewise.
* localedata/locales/ps_AF: Likewise.
* localedata/locales/ro_RO: Likewise.
* localedata/locales/ru_RU: Likewise.
* localedata/locales/ru_UA: Likewise.
* localedata/locales/sc_IT: Likewise.
* localedata/locales/se_NO: Likewise.
* localedata/locales/si_LK: Likewise.
* localedata/locales/sq_AL: Likewise.
* localedata/locales/sv_FI: Likewise.
* localedata/locales/sv_FI@euro: Likewise.
* localedata/locales/sv_SE: Likewise.
* localedata/locales/szl_PL: Likewise.
* localedata/locales/tg_TJ: Likewise.
* localedata/locales/ti_ER: Likewise.
* localedata/locales/tk_TM: Likewise.
* localedata/locales/tl_PH: Likewise.
* localedata/locales/tr_TR: Likewise.
* localedata/locales/tt_RU: Likewise.
* localedata/locales/tt_RU@iqtelif: Likewise.
* localedata/locales/ug_CN: Likewise.
* localedata/locales/uk_UA: Likewise.
* localedata/locales/uz_UZ: Likewise.
* localedata/locales/uz_UZ@cyrillic: Likewise.
* localedata/locales/vi_VN: Likewise.
* localedata/locales/yi_US: Likewise.
* localedata/locales/yo_NG: Likewise.
2018-02-27 17:47:50 +01:00
Mike FABIAN
df74ef786f
Add sections for various scripts to the iso14651_t1_common file
...
* localedata/locales/iso14651_t1_common: Add sections for various
scripts to the iso14651_t1_common file.
2018-02-27 16:52:54 +01:00
Mike FABIAN
d5adfbadd4
iso14651_t1_common: make the fourth level the codepoint for characters which are ignorable on all 4 levels
...
Entries for characters which have “IGNORE” on all 4 levels, like:
    <U0001> IGNORE;IGNORE;IGNORE;IGNORE % START OF HEADING (in ISO 6429)
are changed into:
    <U0001> IGNORE;IGNORE;IGNORE;<U0001> % START OF HEADING (in ISO 6429)
i.e. the code point of the character is put into the fourth level
instead of “IGNORE”. Without that change, all such characters
would compare equal, which would make a wcscoll test case fail.
It is better to have a clearly defined sort order even for characters
like this, so the code point is used as a tie-break.
* localedata/locales/iso14651_t1_common: Use the code point of a
character in the fourth collation level instead of IGNORE for all
entries which have IGNORE on all 4 levels.
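The effect of the fourth-level tie-break can be illustrated with a toy model.
This is only a sketch: the two miniature tables and the sort_key helper are
hypothetical, integer weights stand in for glibc's collating symbols, and the
real comparison is done by the C library, not by code like this.

```python
IGNORE = None  # stand-in for the IGNORE keyword in the locale source

# Hypothetical miniature tables modelled on the two entry styles quoted above.
old_table = {
    "\u0001": (IGNORE, IGNORE, IGNORE, IGNORE),
    "\u0002": (IGNORE, IGNORE, IGNORE, IGNORE),
}
new_table = {
    "\u0001": (IGNORE, IGNORE, IGNORE, 0x0001),
    "\u0002": (IGNORE, IGNORE, IGNORE, 0x0002),
}


def sort_key(text, table):
    """Build one weight list per level, skipping ignored weights."""
    levels = ([], [], [], [])
    for ch in text:
        for level, weight in zip(levels, table[ch]):
            if weight is not IGNORE:
                level.append(weight)
    return levels


# With IGNORE on all four levels both keys are empty, so the characters tie.
old_tie = sort_key("\u0001", old_table) == sort_key("\u0002", old_table)
# With the code point on level four the two characters are ordered.
new_ordered = sort_key("\u0001", new_table) < sort_key("\u0002", new_table)
```

In this model the first three levels stay empty for such characters, so they
still do not influence the ordering of other text until everything else ties.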
2018-02-27 16:50:30 +01:00
Mike FABIAN
5f5a961091
Add convenience symbols like <AFTER-A>, <BEFORE-A> to iso14651_t1_common
...
* localedata/locales/iso14651_t1_common: Add some convenient collation
symbols like <AFTER-A>, <BEFORE-A> to make tailoring easier using
rules similar to those in CLDR.
2018-02-27 16:47:22 +01:00
Mike FABIAN
8a97e9002f
Fixing syntax errors after updating the iso14651_t1_common file
...
* localedata/locales/iso14651_t1_common: The new version of this
file downloaded from ISO contained several syntax errors which
are fixed by this patch.
2018-02-27 16:45:30 +01:00
Mike FABIAN
bbdd2fba7d
iso14651_t1_common: <U\([0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F]\)> → <U000\1>
...
* localedata/locales/iso14651_t1_common: replace all <U.....>
with <U000.....> because glibc understands only 4 digit or 8 digit
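The substitution named in the subject line can be sketched in Python as
follows. The pattern is the one from the subject, written with a repetition
count; the function name pad_code_points is made up for this illustration,
and nothing here claims to be the tool that was actually used for the change.

```python
import re

# Same pattern as in the commit subject: exactly five hex digits inside <U...>.
FIVE_DIGIT = re.compile(r"<U([0-9A-F]{5})>")


def pad_code_points(line):
    """Rewrite five-digit <Uxxxxx> symbols as eight-digit <U000xxxxx>."""
    return FIVE_DIGIT.sub(r"<U000\1>", line)


padded = pad_code_points("<U1D400> <S1D400>;<BASE>;<MIN>;<U1D400>")
```

Four-digit and eight-digit symbols do not match the pattern and are left
unchanged, so running the substitution a second time has no further effect.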
2018-02-27 16:44:03 +01:00
Mike FABIAN
1569e551af
Necessary changes after updating the iso14651_t1_common file
...
* localedata/locales/iso14651_t1_common: Necessary changes
to make the file downloaded from ISO usable by glibc.
2018-02-27 16:42:14 +01:00
Mike FABIAN
9479b6d5e0
Update iso14651_t1_common file to ISO14651_2016_TABLE1_en.txt [BZ #14095]
...
[BZ #14095] - Review / update collation data from Unicode / ISO 14651
File downloaded from:
http://standards.iso.org/iso-iec/14651/ed-4/ISO14651_2016_TABLE1_en.txt
Updating this file alone is not enough: there are problems in the new
file which need to be fixed, and the collation rules for many locales
need to be adapted. This is done by the following patches.
This update also fixes the problem that many characters are treated as
identical when sorting because they were not yet in the old
iso14651_t1_common file, see:
https://bugzilla.redhat.com/show_bug.cgi?id=1336308
- Infinite (∞) and empty set (∅) are treated as if they were the same character by sort and uniq
[BZ #14095]
* localedata/locales/iso14651_t1_common: Update file to
latest version from ISO (ISO14651_2016_TABLE1_en.txt).
2018-02-27 16:36:31 +01:00
Alexandre Oliva
8da25eec0a
Collation fix: make forward accent sorting the default [BZ #17750]
...
[BZ #17750]
* Makefile: add fr_CA.UTF-8 to test-input and LOCALES.
* localedata/fr_CA.UTF-8.in: New file with test data for backward
accents sorting.
* localedata/fr_FR.UTF-8.in: Fix test data for forward accents
sorting.
* localedata/locales/cs_CZ (LC_COLLATE): Remove “define DIACRIT_FORWARD”
* localedata/locales/de_DE (LC_COLLATE): Likewise.
* localedata/locales/hu_HU (LC_COLLATE): Likewise.
* localedata/locales/lb_LU (LC_COLLATE): Likewise.
* localedata/locales/yuw_PG (LC_COLLATE): Likewise.
* localedata/locales/fr_CA (LC_COLLATE): Add “define DIACRIT_BACKWARD”
* localedata/locales/iso14651_t1_common: Use “ifdef DIACRIT_FORWARD”
instead of “ifdef DIACRIT_BACKWARD”.
The only locale which currently needs backward accents sorting is fr_CA.
Therefore, forward accents sorting should be the default.
Before this patch, backward accents sorting was the default and all
locales except fr_CA had to use
    define DIACRIT_FORWARD
before
    copy "iso14651_t1"
Most locales didn’t do that and thus got the inappropriate backward
accents sorting by accident. Now only the fr_CA locale needs to use
    define DIACRIT_BACKWARD
before
    copy "iso14651_t1"
Original patch slightly modified by: Mike FABIAN <mfabian@redhat.com>
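The difference between forward and backward accents sorting can be shown with
a toy two-level key in Python. This is only a sketch: collation_key is a
hypothetical helper, the accent weights are simply the code points of the
combining marks, and the actual behaviour is defined by the locale sources
and glibc, not by this model.

```python
import unicodedata


def collation_key(word, backward_accents=False):
    """Toy two-level key: base letters first, then one accent weight per letter."""
    primary = []
    secondary = []
    for ch in unicodedata.normalize("NFD", word):
        if unicodedata.combining(ch):
            # Attach the accent to the preceding base letter.
            if secondary:
                secondary[-1] = ord(ch)
        else:
            primary.append(ch)
            secondary.append(0)  # 0 means "no accent"
    if backward_accents:
        # Backward sorting compares the accents starting from the end of the word.
        secondary.reverse()
    return (primary, secondary)


words = ["côté", "cote", "coté", "côte"]
forward = sorted(words, key=collation_key)
backward = sorted(words, key=lambda w: collation_key(w, backward_accents=True))
```

In this model forward sorting yields cote, coté, côte, côté, while backward
sorting yields cote, côte, coté, côté, because the last accent in the word is
compared first.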
2017-11-29 11:56:46 +01:00
Santhosh Thottingal
b05eca0e1d
Correct collation rules for Malayalam.
...
[BZ #19922]
* locales/iso14651_t1_common: Add collation rules for U+07DA to U+07DF.
[BZ #19919]
* locales/iso14651_t1_common: Correct collation of U+0D36 and U+0D37.
2017-06-11 10:08:37 -04:00
Mike Frysinger
a4cea54b12
localedata: standardize copyright/license information [BZ #11213]
...
Use the language from the FSF in all locale files to disclaim any
license/copyright on locale data.
See https://sourceware.org/ml/libc-locales/2013-q1/msg00048.html
2016-03-21 02:29:56 -04:00
Ulrich Drepper
b426c80f5f
Fix whitespaces
2011-05-15 11:37:52 -04:00
Ulrich Drepper
08ba84136f
Move Dzongkha collation rules to common collation rules file
2011-05-15 11:36:07 -04:00
Pravin Satpute
1e5e9ec825
Fix sorting of Malayalam letter 'na'.
2010-02-03 03:50:01 -08:00
Ulrich Drepper
6b4f51823c
Fix whitespaces.
2010-02-03 03:36:52 -08:00
Pravin Satpute
3e8a75d1b9
Move Tamil collation data to common source file.
2010-02-03 03:32:06 -08:00
Keith Stribley
3c2c4bf6f7
Implement Burmese language locale for Myanmar.
2009-10-30 08:14:02 -07:00
Ulrich Drepper
115a532734
* localedata/locales/bn_BD: Remove comment about missing collation
rules.
* localedata/locales/iso14651_t1_common: Add Bengali collation rules.
Patch by Pravin Satpute <psatpute@redhat.com>.
2009-05-04 21:20:20 +00:00
Ulrich Drepper
eee6b14327
[BZ #9759]
...
* dirent/dirent.h: Adjust prototypes of scandir, scandir64, alphasort,
alphasort64, versionsort, and versionsort64 to POSIX 2008.
* dirent/alphasort.c: Adjust implementation to type change.
* dirent/alphasort64.c: Likewise.
* dirent/scandir.c: Likewise.
* dirent/versionsort.c: Likewise.
* dirent/versionsort64.c: Likewise.
* sysdeps/wordsize-64/alphasort.c: Add hack to hide alphasort64
declaration.
* sysdeps/wordsize-64/versionsort.c: Add hack to hide versionsort64
declaration.
2009-03-15 21:33:19 +00:00
Ulrich Drepper
638633961d
* locales/iso14651_t1_common: Add rules for sorting Malayalam.
...
Patch by Santhosh Thottingal <santhosh.thottingal@gmail.com>.
2009-02-11 15:42:53 +00:00
Ulrich Drepper
06057297c4
* locales/iso14651_t1_common: Fix sorting of U+0AB3.
...
Patch by Pravin Satpute <psatpute@redhat.com>.
2008-12-31 14:58:14 +00:00
Ulrich Drepper
6daf1a2fb1
[BZ #6867]
...
* sysdeps/powerpc/elf/rtld-global-offsets.sym: Fix typo.
2008-10-31 19:03:31 +00:00
Ulrich Drepper
46026b5589
* locales/iso14651_t1_common: Add Kannada collation support.
...
Patch by Pravin Satpute <psatpute@redhat.com>.
2008-07-11 17:05:42 +00:00
Ulrich Drepper
99ae13c825
* locales/iso14651_t1_common: Add support for Gurumukhi script.
...
Patch by Pravin Satpute <psatpute@redhat.com>.
2008-06-24 16:59:47 +00:00
Ulrich Drepper
e564d29d8e
Remove U0C0D entry added for Telugu.
2008-05-21 15:13:02 +00:00
Ulrich Drepper
74e1338588
* string/strcasestr.c (CMP_FUNC): Use __strncasecmp, not strncasecmp.
2008-05-16 18:19:18 +00:00
Ulrich Drepper
2f9a1be867
[BZ #6442]
...
* string/endian.h: Add macros for fixed-size endian conversion.
* bits/byteswap.h: Allow inclusion from <endian.h>.
* sysdeps/i386/bits/byteswap.h: Likewise.
* sysdeps/ia64/bits/byteswap.h: Likewise.
* sysdeps/s390/bits/byteswap.h: Likewise.
* sysdeps/x86_64/bits/byteswap.h: Likewise.
* string/Makefile (tests): Add tst-endian.
* string/tst-endian.c: New file.
2008-05-15 02:54:33 +00:00
Ulrich Drepper
23c37224d3
Fix first weight for U+1E60, U+1E62, U+1E64, U+1E66, and U+1E68.
2008-04-07 23:53:20 +00:00
Ulrich Drepper
4e0b2dbe54
* locales/iso14651_t1_common: Add support for Gujarati script.
...
Patch by Pravin Satpute <psatpute@redhat.com>.
2008-03-31 14:15:28 +00:00
Ulrich Drepper
85ac24138b
* locales/iso14651_t1_common: Add support for Devanagari script.
...
* locales/mr_IN: Adjust Devanagari sorting for mr_IN.
Patch by Pravin Satpute <psatpute@redhat.com>.
2008-03-24 05:08:33 +00:00
Ulrich Drepper
3a054d7ab0
* locale/programs/locfile-token.h: Remove tok_elif, add tok_elifdef
and tok_elifndef.
* locale/programs/locfile-kw.gperf: Likewise.
* locale/programs/ld-collate.c: Implement primitive preprocessor.
2007-10-11 02:36:04 +00:00
Ulrich Drepper
592a95ee7c
* po/pt_BR.po: Fix typo.
2007-09-30 16:57:15 +00:00
Ulrich Drepper
762422d1bd
* locale/programs/ld-collate.c (collate_read): Allow order_start
after copy.
2007-04-28 06:51:26 +00:00