Commit Graph

660 Commits

Author SHA1 Message Date
Mike Frysinger
6bc81cf205 localedata: standardize first few lines
Purely a style touchup to make sure the headers all look the same.
2016-03-21 02:00:09 -04:00
Mike Frysinger
b07aa58944 localedata: use same comment_char/escape_char in these files
These files are small and easy to convert to what most others use.
2016-03-16 14:59:05 -04:00
Carlos O'Donell
6f915e9dc8 localedata: an_ES: fix case of lang_ab
This needs to be lowercase to match the local ISO 639 database.
2016-03-16 00:54:56 -04:00
Mike Frysinger
5453f739e5 localedata: clear LC_IDENTIFICATION tel/fax fields
These fields aren't terribly useful and most don't set it.
2016-03-05 11:53:23 -05:00
Mike Frysinger
dacc1a23d3 localedata: es_PR: change LC_MEASUREMENT to metric
Puerto Rico uses the metric system and has for a long time.
https://en.wikipedia.org/wiki/Puerto_Rican_units_of_measurement
2016-02-29 15:57:19 -05:00
Mike Frysinger
75aa31de9f localedata: an_ES: fix lang_ab value
Aragonese is classified as "an" so set it.
2016-02-29 15:54:36 -05:00
Mike Frysinger
b6ebba701c locales: pap_AN: delete old/deprecated locale [BZ #16003]
From the bug:
Netherlands Antilles was dissolved, and "AN" is not a part of ISO 3166
anymore.  According to setlocale(3), "territory is an ISO 3166 country
code".  We now have pap_AW and pap_CW.

Reported-by: Chris Leonard <cjlhomeaddress@gmail.com>
2016-02-19 13:43:38 -05:00
Mike Frysinger
d3362b1e3c localedata: CLDRv28: update LC_TELEPHONE.int_prefix
This updates a bunch of locales based on CLDR v28 data:
  ar_SS: int_prefix: changing 249 to 211
  bn_BD: int_prefix: changing 88 to 880
  dz_BT: int_prefix: changing 66 to 975
  en_HK: int_prefix: changing  to 852
  en_PH: int_prefix: changing  to 63
  en_SG: int_prefix: changing  to 65
  es_DO: int_prefix: changing 1809 to 1
  es_PA: int_prefix: changing 502 to 507
  es_PR: int_prefix: changing 1787 to 1
  km_KH: int_prefix: changing 856 to 855
  mt_MT: int_prefix: changing  to 356
  ne_NP: int_prefix: changing 91 to 977
  pap_AW: int_prefix: changing 599 to 297
  the_NP: int_prefix: changing 91 to 977
  tk_TM: int_prefix: changing  to 993
  uz_UZ: int_prefix: changing 27 to 998
  zh_SG: int_prefix: changing  to 65

I've also checked these against https://countrycode.org/.

Note: the Dominican Republic (DO) and Puerto Rico (PR) updates are
correct: they both use +1.  Historically, DO had one area code of
809 and PR of 787 which is why they were listed as such, but they
have both expanded into 829 and 939 respectively, so using the four
digit value is definitely incorrect now.
2016-02-19 12:46:14 -05:00
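A minimal sketch of why a four-digit int_prefix stops working once a territory gains a second area code. The helper name `matches_prefix` and the sample phone numbers are hypothetical, chosen only for illustration; nothing here comes from glibc or CLDR.

```python
def matches_prefix(number: str, int_prefix: str) -> bool:
    """Return True if an international number starts with int_prefix."""
    # Keep only the digits so "+1 809 ..." and "1809..." compare equally.
    digits = "".join(ch for ch in number if ch.isdigit())
    return digits.startswith(int_prefix)


# Hypothetical Dominican Republic numbers, one per area code named above.
old_area = "+1 809 555 0100"
new_area = "+1 829 555 0100"

print(matches_prefix(old_area, "1809"))  # True
print(matches_prefix(new_area, "1809"))  # False: 829 is not covered
print(matches_prefix(new_area, "1"))     # True: the country code alone
```

Only the bare country code matches every number in the territory, which is the reasoning behind changing 1809 and 1787 to 1.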
Florian Weimer
ff889b1965 Remove trailing newline from date_fmt in Serbian locales [BZ #19581] 2016-02-19 14:21:34 +01:00
Mike Frysinger
3040149d43 localedata: dz_BT/ps_AF: reformat data
ps_AF is the only file that indents fields with tabs.  Kill them.

dz_BT is the only file with a slightly indented field.  Kill that.
2016-02-19 02:54:48 -05:00
Mike Frysinger
b859f89ad6 localedata: trim trailing blank lines/comments
No functional changes, just trying to standardize the format a bit.
2016-02-18 21:34:21 -05:00
Mike Frysinger
cd46e35db1 localedata: convert all files to utf-8
The comments were using various encodings like ISO-8859-1.
Convert them all over to UTF-8.
2016-02-08 23:38:04 -05:00
Evert
812618055e localedata: nl_NL: date_fmt: rewrite to match standards [BZ #16495]
Add some references to public Dutch standards.
2016-01-08 19:13:41 -05:00
Mike Frysinger
a82cd945b5 localedata: nl_NL@euro: copy measurement from nl_NL [BZ #19198]
No real changes here as the output is the same.  Just making the input
a little bit nicer.
2015-12-29 23:19:54 -05:00
Damyan Ivanov
b69b5b3e3e localedata: bg_BG: use colon as time separator [BZ #19385]
The only official source is the "Official spelling dictionary of the
Bulgarian language, Prosveta 2012", which states there are three ways
to separate time components: comma, colon and dot. That same dictionary
doesn't say which one is preferred.

So I turned to the mailing list of the translators of free software in
Bulgarian. The consensus is that colon is the only separator that is
widely used in Bulgarian texts and everything else will just be confusing.

URL: http://lists.ludost.net/pipermail/dict/2015-December/000538.html
2015-12-29 13:49:01 -05:00
Joseph Myers
85bafe6f3d Automate LC_CTYPE generation for tr_TR, update to Unicode 8.0.0 (bug 18491).
This patch makes the automation of Unicode LC_CTYPE generation also
support generating the modified LC_CTYPE used for Turkish (where case
conversions of 'i' and 'I' differ from ASCII conventions), so allowing
that to be more readily kept in sync for future Unicode updates.  The
patch includes the locale update generated by the scripts.

Tested for x86_64.

	[BZ #18491]
	* unicode-gen/unicode_utils.py (to_upper_turkish): New function.
	(to_lower_turkish): Likewise.
	* unicode-gen/gen_unicode_ctype.py (output_tables): Support
	producing output with Turkish case conversions.
	(--turkish): New command-line option.
	* unicode-gen/Makefile (GENERATED): Add tr_TR.
	(tr_TR): New rule.
	* locales/tr_TR: Regenerate LC_CTYPE.
2015-12-11 12:45:19 +00:00
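A rough sketch of the Turkish case rules this commit refers to, operating on strings rather than on the code points the generator scripts handle. The function names `upper_turkish_sketch` and `lower_turkish_sketch` and the override tables are hypothetical; this is not the code in unicode-gen/unicode_utils.py.

```python
# Code-point overrides for Turkish; every other character falls back to
# Python's default Unicode case mapping.
TURKISH_UPPER = {
    0x0069: 0x0130,  # i -> capital I with dot above
    0x0131: 0x0049,  # dotless i -> I
}
TURKISH_LOWER = {
    0x0049: 0x0131,  # I -> dotless i
    0x0130: 0x0069,  # capital I with dot above -> i
}


def upper_turkish_sketch(text: str) -> str:
    """Uppercase text using the Turkish dotted/dotless i pairs."""
    return "".join(
        chr(TURKISH_UPPER[ord(ch)]) if ord(ch) in TURKISH_UPPER else ch.upper()
        for ch in text
    )


def lower_turkish_sketch(text: str) -> str:
    """Lowercase text using the Turkish dotted/dotless i pairs."""
    return "".join(
        chr(TURKISH_LOWER[ord(ch)]) if ord(ch) in TURKISH_LOWER else ch.lower()
        for ch in text
    )


print(upper_turkish_sketch("istanbul"))  # İSTANBUL
print(lower_turkish_sketch("IĞDIR"))     # ığdır
```

The overrides are checked before the default mapping because the default lowercasing of U+0130 yields two code points, which would not round-trip.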
Mike FABIAN
23256f5ed8 Update to Unicode 8.0.0.
Update __STDC_ISO_10646__ to 201505L for Unicode 8.0.0.
Update character encoding, ctype, and transliteration tables.
New scripts autogenerate transliteration tables.
2015-12-10 00:33:48 -05:00
Mike FABIAN
589ac52328 Update da, nb, nn, and sv locales (Bug 89)
Add transliteration rules for da, nb, nn, and sv locales.
2015-12-09 23:08:36 -05:00
Carlos O'Donell
dd8e8e5476 Update transliteration support to Unicode 7.0.0.
The transliteration files are now autogenerated from upstream Unicode
data.
2015-12-09 22:52:13 -05:00
Mike FABIAN
6f84663a4f Generic updates to transliterations.
- Remove duplicate transliterations for U+0152 and U+0153 from
  C-translit.h.in.
- Change Ö U+00D6 LATIN CAPITAL LETTER O WITH STROKE → O
  (instead of → OE)
- Change ö U+00F6 LATIN SMALL LETTER O WITH STROKE → o
  (instead of → oe)
- Add ₹ U+20B9 INDIAN RUPEE SIGN → INR
- Add ₫ U+20AB DONG SIGN → Dong (in addition to "₫ → Đồng")
- Add many others from
  http://unicode.org/cldr/trac/browser/trunk/common/transforms/Latin-ASCII.xml
- Add some more currency signs suggested by Marko Myllynen
- Add another patch with more characters by Marko Myllynen
2015-12-09 21:51:26 -05:00
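As an illustration of what a transliteration rule does, the sketch below applies two of the currency mappings listed in this commit to a string. The table and the function name `transliterate_sketch` are hypothetical; the real rules live in the locale transliteration files and are applied by the C library, not by code like this.

```python
# Two mappings taken from the list above; a real table is far larger.
TRANSLIT_TABLE = {
    "\u20b9": "INR",   # INDIAN RUPEE SIGN
    "\u20ab": "Dong",  # DONG SIGN
}


def transliterate_sketch(text: str) -> str:
    """Replace each character that has a rule; leave the rest untouched."""
    return "".join(TRANSLIT_TABLE.get(ch, ch) for ch in text)


print(transliterate_sketch("\u20b9500"))    # INR500
print(transliterate_sketch("20000\u20ab"))  # 20000Dong
```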
Gunnar Hjalmarsson
213938ee8a lt_LT: change currency symbol to the euro [BZ #18953]
Lithuania switched currency to the Euro on 1st Jan 2015.
2015-10-17 00:28:13 -04:00
Egmont Koblinger
c7266a2d82 hu_HU: change time separator to colon [BZ #18918]
The previous (11th) version of the Hungarian spelling rules (released
in 1984) said that the separator had to be a dot, e.g. 10.35 meaning
10 o'clock 35 minutes. glibc correctly implements this.

The brand new (12th) version, in effect since September 1, 2015, adapts
to the common use of the colon (especially in the digital world) and
allows either separator, without even expressing a preference.

For computer systems, using colons is way more typical and probably
easier to recognize. Dot is typically used in printed materials.

It also avoids an almost ambiguous situation where a space makes a
difference, e.g. "10.15-ig" means "until 10 o'clock 15 minutes"
whereas "10. 15-ig" means "until 15th of October". So I believe using
the colon as the separator is not only more frequent in the computer
world, but is also easier and quicker to recognize for the brain that
it's about hour:minute rather than month and day. And luckily it's now
equally correct according to the official rules.

11th edition: http://helyesiras.mta.hu/helyesiras/default/akh11

12th edition: http://helyesiras.mta.hu/helyesiras/default/akh12

In both editions it's the very last (299th and 300th, respectively) rule.

Microsoft also uses and recommends a colon since at least May 2011:
http://download.microsoft.com/download/e/6/1/e61266b2-d8b4-4fe0-a553-f01dc3976675/hun-hun-StyleGuide.pdf
  The time format is different in common language and in the language of
  IT. In common texts we usually do not abbreviate, so the full forms are
  used: “7 óra 10 perckor csörgött a telefon”. However, the short format,
  consisting of numerals only, can also be used. In this case a period
  must be used between the two numbers and there must not be a space
  between them: “találkozzunk 10.45-kor”.

  However, in software mostly the short format is used, and the numbers
  are separated by a colon. An obvious example is the clock in the bottom
  right corner of your screen, thus 18:31.
2015-10-17 00:15:07 -04:00
Marko Myllynen
441c3b59d1 Fix lang_lib/lang_term as per ISO 639-2 [BZ #16973]
lang_lib (which reflects ISO 639-2/B (bibliographic) codes) and
lang_term (which reflects ISO 639-2/T (terminology) codes) should be
identical except for those languages for which ISO 639-2 specifies
separate bibliographic/terminology values.

I used this Library of Congress page as the source:
	http://www.loc.gov/standards/iso639-2/php/code_list.php
2015-08-18 10:15:04 -04:00
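The rule described in this commit can be sketched as a small consistency check. The table below is a partial, illustrative list of languages whose ISO 639-2 bibliographic and terminology codes differ, and `codes_consistent` is a hypothetical helper, not part of localedef.

```python
# Partial list of ISO 639-2 codes where B (bibliographic) and
# T (terminology) differ; a complete check would need the full list.
B_TO_T = {
    "cze": "ces",  # Czech
    "dut": "nld",  # Dutch
    "fre": "fra",  # French
    "ger": "deu",  # German
    "gre": "ell",  # Greek
    "tib": "bod",  # Tibetan
    "wel": "cym",  # Welsh
}


def codes_consistent(lang_lib: str, lang_term: str) -> bool:
    """Check that lang_term is the T code matching the B code in lang_lib."""
    # Languages not in the table are expected to use one code for both.
    return lang_term == B_TO_T.get(lang_lib, lang_lib)


print(codes_consistent("ger", "deu"))  # True
print(codes_consistent("ger", "ger"))  # False: German has a separate T code
print(codes_consistent("hun", "hun"))  # True: Hungarian uses one code
```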
Arslanbek Astemirov
db2bcbcb63 locales/ce_RU: sync with other *_RU locales
[BZ #18618]
* locales/ce_RU (LC_IDENTIFICATION): Fix language.
(LC_TIME): Set first_weekday and first_workday.
(LC_NUMERIC): Copy ru_RU.
2015-08-07 11:10:23 +00:00
Marko Myllynen
42eaa27fac localedata: remove timezone information [BZ #18525]
as discussed in the thread starting at

https://sourceware.org/ml/libc-alpha/2015-06/msg00098.html

it looks like the best option is to remove locale timezone information
from locales which currently provide it (in incomplete or incorrect
fashion) rather than to start duplicating tzdata info in glibc.
2015-08-05 05:02:18 -04:00
Marko Myllynen
f30d94a74a locale: Remove obsolete repertoire map references
repertoire maps and character mnemonics were used early in the glibc
i18n/l10n effort but were quickly deprecated in favor of Unicode code
points. According to ChangeLog, the in-tree repertoire maps were
removed 2000-07-07 but some stray references remain even today. The
patch below removes them.
2015-07-21 03:54:02 -04:00
Khem Raj
8bb524be8f locale: Do not define lang_ab for tcy_IN and bhb_IN
After the rename, localedef now complains and the build fails:

LC_ADDRESS: field `lang_ab' must not be defined

Earlier, the locale names matched the lang_ab definitions 'tu' and 'bh',
but after the rename they do not.
2015-07-21 02:52:00 -04:00
Pravin Satpute
032c510db0 Correcting language code for Bhili and Tulu locales (bug 17475)
Bhili [1] and Tulu [2] do not have ISO 639-1 codes. This patch renames
the locale files to use the correct codes and also fixes iso-639.def.

1. http://www-01.sil.org/iso639-3/documentation.asp?id=bhb
2. http://www-01.sil.org/iso639-3/documentation.asp?id=tcy

localedata/ChangeLog:

2015-07-02  Pravin Satpute  <psatpute@redhat.com>

	[BZ #17475]
	* locales/tu_IN: renamed to tcy_IN
	* locales/bh_IN: renamed to bhb_IN

Changelog:

2015-03-05  Pravin Satpute  <psatpute@redhat.com>

	[BZ #17475]
	* locale/iso-639.def: Update Bhili and Tulu language codes as
	per iso639-3.
2015-07-15 16:06:18 +05:30
Andriy Rysin
6afb9c0175 Fix sorting order for Ukrainian locale (BZ 17293)
In the introduction to the official orthography rules for the Ukrainian
language (http://spelling.ulif.org.ua/peredmova.htm) there's a note
that only the apostrophe does not affect the order of words when sorting.
As can be seen from the official alphabet, the soft sign
(U+044C/U+042C) has its own fixed position and thus affects the order;
likewise the letters "е" and "є" (CYR-IE: U+0435/U+0415 and UKR-IE:
U+0454/U+0404) have their own positions and should sort as separate
letters.
This also corresponds to the official Unicode collation chart for these
letters: http://unicode.org/charts/collation/chart_Cyrillic.html
2015-05-26 23:51:18 +05:30
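A simplified, single-level sketch of the ordering described in this commit: each letter ranks by its position in the Ukrainian alphabet and apostrophes are skipped. The names below are hypothetical, the sketch assumes input made only of Ukrainian letters and apostrophes, and it does not reproduce the multi-level LC_COLLATE rules in the locale file.

```python
# The 33-letter Ukrainian alphabet in its official order.
UKRAINIAN_ALPHABET = "абвгґдеєжзиіїйклмнопрстуфхцчшщьюя"
RANK = {letter: index for index, letter in enumerate(UKRAINIAN_ALPHABET)}

# Apostrophe variants that should not influence the order.
APOSTROPHES = {"'", "\u2019", "\u02bc"}


def sort_key(word: str) -> tuple:
    """Rank each letter by alphabet position, ignoring apostrophes."""
    return tuple(RANK[ch] for ch in word.lower() if ch not in APOSTROPHES)


words = ["єнот", "ель", "м'яч", "мул", "ера"]
print(sorted(words, key=sort_key))
# ['ель', 'ера', 'єнот', 'мул', "м'яч"]
```

Words starting with "е" and "є" land in separate groups, and "м'яч" sorts after "мул" because the apostrophe is skipped and "я" follows "у".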
Marko Myllynen
c3cc2cf35a Fix bo_CN and bo_IN.
Both bo_CN and bo_IN were not compiling. The following fix
gets them into a usable state again giving a clean build
result for `make localedata/install-locales`.
2015-05-16 01:40:04 -04:00
Christian Schmidt
92566b4922 Update currency_symbol in da_DK 2015-05-07 11:56:56 +05:30
Alexandre Oliva
4a4839c94a Unicode 7.0.0 update; added generator scripts.
for localedata/ChangeLog

	[BZ #17588]
	[BZ #13064]
	[BZ #14094]
	[BZ #17998]
	* unicode-gen/Makefile: New.
	* unicode-gen/unicode-license.txt: New, from Unicode.
	* unicode-gen/UnicodeData.txt: New, from Unicode.
	* unicode-gen/DerivedCoreProperties.txt: New, from Unicode.
	* unicode-gen/EastAsianWidth.txt: New, from Unicode.
	* unicode-gen/gen_unicode_ctype.py: New generator, from Mike
	FABIAN <mfabian@redhat.com>.
	* unicode-gen/ctype_compatibility.py: New verifier, from
	Pravin Satpute <psatpute@redhat.com> and Mike FABIAN.
	* unicode-gen/ctype_compatibility_test_cases.py: New verifier
	module, from Mike FABIAN.
	* unicode-gen/utf8_gen.py: New generator, from Pravin Satpute
	and Mike FABIAN.
	* unicode-gen/utf8_compatibility.py: New verifier, from Pravin
	Satpute and Mike FABIAN.
	* charmaps/UTF-8: Update.
	* locales/i18n: Update.
	* gen-unicode-ctype.c: Remove.
	* tst-ctype-de_DE.ISO-8859-1.in: Adjust, islower now returns
	true for ordinal indicators.
2015-02-20 20:14:59 -02:00
Pravin Satpute
01839a33ec New locale raj_IN (#16857) 2014-12-01 15:23:47 +05:30
Pravin Satpute
2687f47b20 New locale ce_RU (BZ #17192) 2014-12-01 15:18:33 +05:30
Tatiana Udalova
fb89b46d1d New Bhilodi and Tulu locales (BZ #17475) 2014-11-12 17:06:39 +05:30
Chris Leonard
46c0cc3aa5 Add lang_name to various locales. 2013-12-26 19:35:18 -05:00
Chris Leonard
8b67c0d9e2 Add lang_name to various locales. 2013-12-20 11:03:15 -05:00
Toke Høiland-Jørgensen
f208205092 Add entries for U00D8 and U00F8. 2013-12-12 14:47:25 -05:00
Marko Myllynen
dc14d999e1 Fix Charset comment in fi_FI, fi_FI@euro 2013-12-12 09:24:35 +05:30
Chris Leonard
14b97c7a8c Add lang_name to various locales. 2013-12-01 08:04:54 -05:00
Chris Leonard
2ddb48d376 Add lang_name to various locales. 2013-11-27 15:52:46 -05:00
Chris Leonard
f70893aacf revert hebrew lang_name addition 2013-11-25 15:31:36 -05:00
Chris Leonard
5d771331c1 Add lang_name to various locales. 2013-11-25 15:20:41 -05:00
Chris Leonard
085b5ddfe3 Add lang_name to various locales. 2013-11-24 19:58:39 -05:00
Chris Leonard
05a209fe3b revert error-generated by bs_BA. 2013-11-23 18:07:00 -05:00
Chris Leonard
7ea1ebb5e3 Add lang_name to various locales. 2013-11-23 16:29:48 -05:00
Chris Leonard
d5a4ef504e Add lang_name to various locales. 2013-11-23 13:10:17 -05:00
Chris Leonard
12e0e8c65d Add lang_name to German, English, Spanish, French locales. 2013-11-22 14:27:18 -05:00
Chris Leonard
d33cafadfe Add lang_name to Arabic locales. 2013-11-21 14:43:51 -05:00
Siddhesh Poyarekar
0417b20fe6 Rename Oriya locale to Odia (bug 15601)
The state of Orissa was officially renamed to Odisha, and the
language from Oriya to Odia, in 2010.

References:

http://zeenews.india.com/election09/story.aspx?aid=739995
http://orissamatters.com/2011/11/07/orissa-became-odisha/
http://www.ndtv.com/article/india/parliament-passes-bill-to-change-orissa-s-name-93888
http://orissa.gov.in/e-magazine/Orissareview/2011/Nov/engpdf/9-17.pdf
2013-11-20 17:47:41 +05:30