Microsoft long ago added a mapping for 0x80 to the Euro sign to their
CP936. While GBK 1.0 doesn't include this mapping, it is compatible,
and Microsoft and glibc alias the two codepages. We could split them
apart so GBK wouldn't include the mapping, but that seems like a lot
of work for little gain.
The standard currently in effect (LST ISO 8601:1997) mandates the use
of hyphens in date formats, whereas the locale currently uses full
stops. The change also matches current CLDR data (v29), Wikipedia's and
Wikia's settings, and Microsoft's Lithuanian Style Guide.
According to "Requirements of information technology in Estonian
language and cultural environment" the monetary symbol should be
written after the amount number:
https://www.evs.ee/products/evs-8-2008
en_CA/LC_TIME currently specifies the date format as "%d/%m/%y".
However, it should be "%Y-%m-%d". This is the standard date format in
Canada as specified by the Canadian Standards Association in CSA Z234.5:1989,
which adopts the ISO 8601 standard.
Here's the web page from the National Research Council of Canada
citing ISO 8601 as the standard date/time format in Canada:
http://www.nrc-cnrc.gc.ca/eng/services/time/faq/#Q8
International Standard ISO 8601 specifies numeric representations
of date and time. The recommended full format is of the form
2001-12-31 23:59:28.73 UTC. The intent of this standard is to avoid
confusion in international communications which can arise with the
many different national notations. This format has the advantage
that it permits dates to be readily sorted in chronological order
by computer systems.
Windows 8+ and OS X also switched to this format.
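As a small illustration of the difference (a sketch in Python, not part
of the locale change itself; the sample dates are arbitrary), the old and
new en_CA patterns format the same day as follows, and only the ISO 8601
form sorts chronologically when compared as plain text:

```python
from datetime import date

OLD_FMT = "%d/%m/%y"   # previous en_CA date format
NEW_FMT = "%Y-%m-%d"   # ISO 8601 date format

sample = date(2001, 12, 31)
old_text = sample.strftime(OLD_FMT)   # day/month/two-digit year
new_text = sample.strftime(NEW_FMT)   # four-digit year-month-day

# Format a few dates both ways and sort the resulting strings as text.
days = [date(2001, 12, 31), date(2000, 1, 15), date(2001, 2, 1)]
iso_sorted = sorted(d.strftime(NEW_FMT) for d in days)
old_sorted = sorted(d.strftime(OLD_FMT) for d in days)

print(old_text, new_text)
print(iso_sorted)   # chronological order
print(old_sorted)   # ordered by day of month, not chronologically
```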
Fix the postal_fmt and country_name entries to continue on the following
line without indentation.
localedata/ChangeLog:
* locales/de_LI (postal_fmt): Fix indentation.
(country_name): Likewise.
The Principality of Liechtenstein currently does not have a corresponding
locale. Given the links with Switzerland, the best option is to base the
locale on the de_CH one (German is the official language) and only change
the country-related categories: LC_ADDRESS and LC_TELEPHONE.
localedata/ChangeLog:
* locales/de_LI: New locale.
* SUPPORTED: Add de_LI.
Some of the newer symbols we're using are missing translit entries, which
causes trouble when generating the locales with older encodings.
tr_TR: ₺ -> "TL"
uz_UZ: ʻ -> "'"
common:
֏ -> "AMD"
₪ -> "ILS"
₱ -> "PHP"
₸ -> "KZT"
₾ -> "GEL"
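The effect of such fallbacks can be sketched in Python. This is only an
illustration of the idea, not how localedef or iconv implement it: the
helper encode_with_translit is hypothetical, and the per-locale and common
entries listed above are merged into one table for brevity.

```python
# Fallback strings taken from the list above.
TRANSLIT = {
    "\u20ba": "TL",    # ₺ (tr_TR)
    "\u02bb": "'",     # ʻ (uz_UZ)
    "\u058f": "AMD",   # ֏
    "\u20aa": "ILS",   # ₪
    "\u20b1": "PHP",   # ₱
    "\u20b8": "KZT",   # ₸
    "\u20be": "GEL",   # ₾
}


def encode_with_translit(text, encoding):
    """Encode text, replacing unencodable characters with their fallback.

    Characters with no table entry are replaced by "?".
    """
    out = []
    for ch in text:
        try:
            ch.encode(encoding)
            out.append(ch)
        except UnicodeEncodeError:
            out.append(TRANSLIT.get(ch, "?"))
    return "".join(out).encode(encoding)


# ISO-8859-9 has no lira sign, so the fallback string is used instead.
print(encode_with_translit("10 \u20ba", "iso8859-9"))
```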
The current test code doesn't check the return value of malloc.
This should rarely (if ever) cause a problem, but rather than add
some return value checks, just statically allocate the buffer on
the stack. This will never fail (or if it does, we've got much
bigger problems that don't matter to the test).
The yes/no strings should be based on the dictionary words. That means
they are capitalized based on the dictionary rather than position in the
sentence (e.g. the first word).
bo_CN: nostr: changing མེན to མིན།
bo_CN: yesstr: changing ཨིན to ཡིན།
dz_BT: nostr: changing མེན to མེན་
dz_BT: yesstr: changing ཨིན to ཨིན་
en_CA: yesstr: changing Yes to yes
en_CA: nostr: changing No to no
en_US: yesstr: changing Yes to yes
en_US: nostr: changing No to no
es_ES: nostr: changing No to no
es_ES: yesstr: changing Si to sí
fi_FI: nostr: changing Ei to ei
fi_FI: yesstr: changing Kyllä to kyllä
ig_NG: yesstr: changing Ee to Eye
ko_KR: nostr: changing 아니오 to 아니요
ky_KG: nostr: changing Жок to жок
ky_KG: yesstr: changing Ооба to ооба
ms_MY: nostr: changing Tidak to tidak
ms_MY: yesstr: changing Ya to ya
te_IN: nostr: changing కాదు to వద్దు
te_IN: yesstr: changing అవను to అవును
ur_PK: nostr: changing نهيں to نہیں
ur_PK: yesstr: changing بلكل to ہاں
uz_UZ: nostr: changing Yo'q to yo‘q
uz_UZ: yesstr: changing Ha to ha
uz_UZ@cyrillic: nostr: changing Йўқ to йўқ
uz_UZ@cyrillic: yesstr: changing Ҳа to ҳа
wae_CH: nostr: changing Nei to nei
wae_CH: yesstr: changing Ja to ja
yo_NG: nostr: changing Bẹ́ẹ̀ kọ́ to Bẹ́ẹ̀kọ́
yo_NG: yesstr: changing Bẹ́ẹ̀ ni to Bẹ́ẹ̀ni
Some of the translations were just wrong.
el_GR: nostr: changing no to όχι
el_GR: yesstr: changing yes to ναι
km_KH: nostr: changing no:NO:n:N to ទេ៖ n
km_KH: yesstr: changing yes:YES:y:Y to បាទ/ចាស៖ y
ug_CN: nostr: changing No to ياق
ug_CN: yesstr: changing Yes to ھەئە
Add missing translations for a number of locales:
af_ZA: nostr: setting to nee
af_ZA: yesstr: setting to ja
am_ET: nostr: setting to አይ
am_ET: yesstr: setting to አዎን
ast_ES: nostr: setting to non
ast_ES: yesstr: setting to sí
be_BY: nostr: setting to не
be_BY: yesstr: setting to так
bem_ZM: nostr: setting to Awe
bem_ZM: yesstr: setting to Ee
bg_BG: nostr: setting to не
bg_BG: yesstr: setting to да
brx_IN: nostr: setting to नहीं
brx_IN: yesstr: setting to हाँ
bs_BA: nostr: setting to ne
bs_BA: yesstr: setting to da
ca_ES: nostr: setting to no
ca_ES: yesstr: setting to sí
da_DK: nostr: setting to nej
da_DK: yesstr: setting to ja
de_DE: nostr: setting to nein
de_DE: yesstr: setting to ja
en_DK: nostr: setting to no
en_DK: yesstr: setting to yes
et_EE: nostr: setting to ei
et_EE: yesstr: setting to jah
eu_ES: nostr: setting to ez
eu_ES: yesstr: setting to bai
fa_IR: nostr: setting to نه
fa_IR: yesstr: setting to بله
ff_SN: nostr: setting to Alaa
ff_SN: yesstr: setting to Eey
fo_FO: nostr: setting to nei
fo_FO: yesstr: setting to já
fr_BE: nostr: setting to non
fr_BE: yesstr: setting to oui
fr_CH: nostr: setting to non
fr_CH: yesstr: setting to oui
fr_FR: nostr: setting to non
fr_FR: yesstr: setting to oui
fr_LU: nostr: setting to non
fr_LU: yesstr: setting to oui
fur_IT: nostr: setting to no
fur_IT: yesstr: setting to sì
fy_DE: nostr: setting to nee
fy_DE: yesstr: setting to ja
ga_IE: nostr: setting to níl
ga_IE: yesstr: setting to tá
gd_GB: nostr: setting to chan eil
gd_GB: yesstr: setting to tha
gl_ES: nostr: setting to non
gl_ES: yesstr: setting to si
gu_IN: nostr: setting to નહીં
gu_IN: yesstr: setting to હા
he_IL: nostr: setting to לא
he_IL: yesstr: setting to כן
hi_IN: nostr: setting to नहीं
hi_IN: yesstr: setting to हाँ
hr_HR: nostr: setting to ne
hr_HR: yesstr: setting to da
hu_HU: nostr: setting to nem
hu_HU: yesstr: setting to igen
id_ID: nostr: setting to tidak
id_ID: yesstr: setting to ya
is_IS: nostr: setting to nei
is_IS: yesstr: setting to já
it_CH: nostr: setting to no
it_CH: yesstr: setting to sì
it_IT: nostr: setting to no
it_IT: yesstr: setting to sì
ka_GE: nostr: setting to არა
ka_GE: yesstr: setting to კი
kk_KZ: nostr: setting to жоқ
kk_KZ: yesstr: setting to иә
kl_GL: nostr: setting to naagga
kl_GL: yesstr: setting to aap
kn_IN: nostr: setting to ಇಲ್ಲ
kn_IN: yesstr: setting to ಹೌದು
ko_KR: yesstr: setting to 예
lb_LU: nostr: setting to nee
lb_LU: yesstr: setting to jo
lg_UG: nostr: setting to Nedda
lg_UG: yesstr: setting to Ye
lt_LT: nostr: setting to ne
lt_LT: yesstr: setting to taip
lv_LV: nostr: setting to nē
lv_LV: yesstr: setting to jā
mg_MG: nostr: setting to Tsia
mg_MG: yesstr: setting to Eny
mn_MN: nostr: setting to үгүй
mn_MN: yesstr: setting to тийм
mr_IN: nostr: setting to नाहीःना
mr_IN: yesstr: setting to होयःहो
mt_MT: nostr: setting to le
mt_MT: yesstr: setting to iva
nb_NO: nostr: setting to nei
nb_NO: yesstr: setting to ja
ne_NP: nostr: setting to होइन
ne_NP: yesstr: setting to हो
nl_NL: nostr: setting to nee
nl_NL: yesstr: setting to ja
nn_NO: nostr: setting to nei
nn_NO: yesstr: setting to ja
or_IN: nostr: setting to ନା
or_IN: yesstr: setting to ହଁ
os_RU: nostr: setting to нӕйы
os_RU: yesstr: setting to уойы
pa_IN: nostr: setting to ਨਹੀਂ
pa_IN: yesstr: setting to ਹਾਂ
pl_PL: nostr: setting to nie
pl_PL: yesstr: setting to tak
pt_BR: nostr: setting to não
pt_BR: yesstr: setting to sim
pt_PT: nostr: setting to não
pt_PT: yesstr: setting to sim
ro_RO: nostr: setting to nu
ro_RO: yesstr: setting to da
ru_RU: nostr: setting to нет
ru_RU: yesstr: setting to да
ru_UA: nostr: setting to нет
ru_UA: yesstr: setting to да
se_NO: nostr: setting to ii
se_NO: yesstr: setting to jo
sl_SI: nostr: setting to ne
sl_SI: yesstr: setting to da
so_DJ: nostr: setting to maya
so_DJ: yesstr: setting to haa
so_SO: nostr: setting to maya
so_SO: yesstr: setting to haa
sq_AL: nostr: setting to jo
sq_AL: yesstr: setting to po
sr_RS@latin: nostr: setting to ne
sr_RS@latin: yesstr: setting to da
sr_RS: nostr: setting to не
sr_RS: yesstr: setting to да
sv_SE: nostr: setting to nej
sv_SE: yesstr: setting to ja
sw_KE: nostr: setting to Hapana
sw_KE: yesstr: setting to Ndiyo
yue_HK: nostr: setting to 唔係
yue_HK: yesstr: setting to 係
zu_ZA: nostr: setting to cha
zu_ZA: yesstr: setting to yebo
The vast majority of languages include yY/nN in their yes/no regexes.
Standardize the few that were missing them.
ms_MY: noexpr: add nN
nan_TW@latin: yesexpr: add yY
nan_TW@latin: noexpr: add nN
se_NO: noexpr: add nN
This also highlighted a few that were incorrectly using yY/nN because
they clashed with their localized messages:
uz_UZ: yesexpr: change ^[+1YyHh] to ^[+1ҲҳHh]
uz_UZ: noexpr: change ^[-0JjNn] to ^[-0ЙйNnYyJj]
uz_UZ@cyrillic: yesexpr: change ^[+1ҲҳYy] to ^[+1ҲҳHh]
uz_UZ@cyrillic: noexpr: change ^[-0ЙйNn] to ^[-0ЙйNnYyJj]
yo_NG: move nN (short for Bẹ́ẹ̀ni) from noexpr to yesexpr
A handful of regexes already allowed +1 for yesexpr and -0 for noexpr,
which is also what the i18n definition uses. Standardize all locales by
allowing these language-independent values in them.
Example change for en_US goes from ^[yY] to ^[+1yY], and from ^[nN]
to ^[-0nN].
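A minimal Python sketch of how the updated en_US expressions classify an
answer. The classify helper is hypothetical; real programs would normally
use rpmatch() or nl_langinfo() with POSIX regexes, but these two bracket
expressions behave the same way in Python's re module.

```python
import re

YESEXPR = re.compile(r"^[+1yY]")   # en_US yesexpr after the change
NOEXPR = re.compile(r"^[-0nN]")    # en_US noexpr after the change


def classify(answer):
    """Return True for yes, False for no, None if neither pattern matches."""
    if YESEXPR.search(answer):
        return True
    if NOEXPR.search(answer):
        return False
    return None


for reply in ("yes", "1", "+", "No", "0", "-", "maybe"):
    print(repr(reply), classify(reply))
```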
Tweak some of the collation settings for a few characters.
Add/update various fields:
LC_MESSAGES
yesstr: set to иә
nostr: set to жоқ
LC_MONETARY
mon_decimal_point: change . to ,
mon_thousands_sep: change to a non-breaking space
p_sep_by_space: change 1 to 2
set int_{p,n}_* fields
LC_NUMERIC
thousands_sep: change , to a non-breaking space
LC_TIME
abday: change saturday from Сн to Сб
LC_TELEPHONE
tel_dom_fmt: set to (%A) %l
int_select: set to 8~10
LC_ADDRESS:
country_post: set to KAZ
country_ab2: set to KZ
country_ab3: set to KAZ
country_isbn: set to 978-601
lang_name: set to қазақ тілі
I've spot checked a number of these, including some that were definitely
wrong (like ff_SN). It also fixes all open week-related bugs.
Since ff_SN is the only one that changes its base date, I also made
sure that its ordering of day translations was correct. Looks like
another case Petr brought up where the week field was not actually
checked against the day arrays.
I also took the opportunity to drop first_weekday/first_workday when
the value aligned with the defaults (1 & 2 respectively). This didn't
impact too many locales in practice because the majority omitted them
already.
A few locales were defining some values incorrectly for their region:
ak_GH: week: changing [7, 19971130, 7] to [7, 19971130, 1]
ak_GH: first_weekday: changing 1 to 2
ayc_PE: week: changing [7, 19971130, 7] to [7, 19971130, 1]
bem_ZM: week: changing [7, 19971130, 4] to [7, 19971130, 1]
bem_ZM: first_weekday: changing 1 to 2
en_IE: first_weekday: changing 2 to 1
en_US: week: changing [7, 19971130, 7] to [7, 19971130, 1]
es_CO: first_weekday: changing 2 to 1
es_ES: week: changing [7, 19971130, 5] to [7, 19971130, 4]
ff_SN: week: changing [7, 19971129, 1] to [7, 19971130, 1]
ff_SN: first_weekday: changing 1 to 2
ga_IE: first_weekday: changing 2 to 1
ht_HT: week: changing [7, 19971130, 7] to [7, 19971130, 1]
ht_HT: first_weekday: changing 1 to 2
mk_MK: week: changing [7, 19971130, 4] to [7, 19971130, 1]
mt_MT: first_weekday: changing 2 to 1
quz_PE: week: changing [7, 19971130, 7] to [7, 19971130, 1]
sr_ME: week: changing [7, 19971130, 4] to [7, 19971130, 1]
sr_RS: week: changing [7, 19971130, 4] to [7, 19971130, 1]
sr_RS@latin: week: changing [7, 19971130, 4] to [7, 19971130, 1]
sw_KE: week: changing [7, 19971130, 4] to [7, 19971130, 1]
sw_KE: first_weekday: changing 2 to 1
uk_UA: week: changing [7, 19971130, 4] to [7, 19971130, 1]
unm_US: week: changing [7, 19971130, 4] to [7, 19971130, 1]
Some locales were copying locales that had the wrong week settings, so
that content had to be duplicated so the values could be adjusted:
el_CY: week: setting to [7, 19971130, 1]
en_AG: week: setting to [7, 19971130, 1]
en_AG: first_weekday: changing 2 to 1
en_ZM: week: setting to [7, 19971130, 1]
es_CU: week: setting to [7, 19971130, 1]
nl_AW: week: setting to [7, 19971130, 1]
sw_TZ: first_weekday: setting to 2
ta_LK: first_weekday: setting to 2
The majority of locales were omitting the week field thus getting the
default [7, 19971130, 0 (localedef) / 7 (ISO standard)]. Unfortunately,
neither of those is used by any locale, so we end up having to define
the field just to set the ndays field. In practice, this rarely matters
due to its usage, and the first two fields match the defaults.
aa_DJ: setting to [7, 19971130, 1]
aa_ER: setting to [7, 19971130, 1]
aa_ER@saaho: setting to [7, 19971130, 1]
aa_ET: setting to [7, 19971130, 1]
af_ZA: setting to [7, 19971130, 1]
am_ET: setting to [7, 19971130, 1]
an_ES: setting to [7, 19971130, 4]
anp_IN: setting to [7, 19971130, 1]
ar_AE: setting to [7, 19971130, 1]
ar_BH: setting to [7, 19971130, 1]
ar_DZ: setting to [7, 19971130, 1]
ar_EG: setting to [7, 19971130, 1]
ar_IN: setting to [7, 19971130, 1]
ar_IQ: setting to [7, 19971130, 1]
ar_JO: setting to [7, 19971130, 1]
ar_KW: setting to [7, 19971130, 1]
ar_LB: setting to [7, 19971130, 1]
ar_LY: setting to [7, 19971130, 1]
ar_MA: setting to [7, 19971130, 1]
ar_OM: setting to [7, 19971130, 1]
ar_QA: setting to [7, 19971130, 1]
ar_SA: setting to [7, 19971130, 1]
ar_SD: setting to [7, 19971130, 1]
ar_SS: setting to [7, 19971130, 1]
ar_SY: setting to [7, 19971130, 1]
ar_TN: setting to [7, 19971130, 1]
ar_YE: setting to [7, 19971130, 1]
as_IN: setting to [7, 19971130, 1]
ast_ES: setting to [7, 19971130, 4]
az_AZ: setting to [7, 19971130, 1]
be_BY: setting to [7, 19971130, 1]
be_BY@latin: setting to [7, 19971130, 1]
ber_DZ: setting to [7, 19971130, 1]
ber_MA: setting to [7, 19971130, 1]
bg_BG: setting to [7, 19971130, 4]
bhb_IN: setting to [7, 19971130, 1]
bho_IN: setting to [7, 19971130, 1]
bn_BD: setting to [7, 19971130, 1]
bn_IN: setting to [7, 19971130, 1]
bo_CN: setting to [7, 19971130, 1]
br_FR: setting to [7, 19971130, 4]
brx_IN: setting to [7, 19971130, 1]
bs_BA: setting to [7, 19971130, 1]
byn_ER: setting to [7, 19971130, 1]
ca_AD: setting to [7, 19971130, 4]
ca_ES: setting to [7, 19971130, 4]
ca_ES@euro: setting to [7, 19971130, 4]
ca_FR: setting to [7, 19971130, 4]
ca_IT: setting to [7, 19971130, 4]
ce_RU: setting to [7, 19971130, 1]
cmn_TW: setting to [7, 19971130, 1]
crh_UA: setting to [7, 19971130, 1]
cv_RU: setting to [7, 19971130, 1]
cy_GB: setting to [7, 19971130, 4]
de_BE: setting to [7, 19971130, 4]
de_LU: setting to [7, 19971130, 4]
doi_IN: setting to [7, 19971130, 1]
dv_MV: setting to [7, 19971130, 1]
dz_BT: setting to [7, 19971130, 1]
el_GR: setting to [7, 19971130, 4]
el_GR@euro: setting to [7, 19971130, 4]
en_AU: setting to [7, 19971130, 1]
en_BW: setting to [7, 19971130, 1]
en_CA: setting to [7, 19971130, 1]
en_HK: setting to [7, 19971130, 1]
en_IE: setting to [7, 19971130, 4]
en_IN: setting to [7, 19971130, 1]
en_NG: setting to [7, 19971130, 1]
en_NZ: setting to [7, 19971130, 1]
en_PH: setting to [7, 19971130, 1]
en_SG: setting to [7, 19971130, 1]
en_ZA: setting to [7, 19971130, 1]
en_ZW: setting to [7, 19971130, 1]
es_AR: setting to [7, 19971130, 1]
es_BO: setting to [7, 19971130, 1]
es_CL: setting to [7, 19971130, 1]
es_CO: setting to [7, 19971130, 1]
es_CR: setting to [7, 19971130, 1]
es_DO: setting to [7, 19971130, 1]
es_EC: setting to [7, 19971130, 1]
es_ES@euro: setting to [7, 19971130, 4]
es_GT: setting to [7, 19971130, 1]
es_HN: setting to [7, 19971130, 1]
es_MX: setting to [7, 19971130, 1]
es_NI: setting to [7, 19971130, 1]
es_PA: setting to [7, 19971130, 1]
es_PE: setting to [7, 19971130, 1]
es_PR: setting to [7, 19971130, 1]
es_PY: setting to [7, 19971130, 1]
es_SV: setting to [7, 19971130, 1]
es_US: setting to [7, 19971130, 1]
es_UY: setting to [7, 19971130, 1]
es_VE: setting to [7, 19971130, 1]
eu_ES: setting to [7, 19971130, 4]
fa_IR: setting to [7, 19971130, 1]
fil_PH: setting to [7, 19971130, 1]
fo_FO: setting to [7, 19971130, 4]
fr_CA: setting to [7, 19971130, 1]
fr_CH: setting to [7, 19971130, 4]
fr_LU: setting to [7, 19971130, 4]
fy_NL: setting to [7, 19971130, 4]
ga_IE: setting to [7, 19971130, 4]
gd_GB: setting to [7, 19971130, 4]
gez_ER: setting to [7, 19971130, 1]
gez_ET: setting to [7, 19971130, 1]
gl_ES: setting to [7, 19971130, 4]
gu_IN: setting to [7, 19971130, 1]
gv_GB: setting to [7, 19971130, 4]
hak_TW: setting to [7, 19971130, 1]
ha_NG: setting to [7, 19971130, 1]
he_IL: setting to [7, 19971130, 1]
hi_IN: setting to [7, 19971130, 1]
hne_IN: setting to [7, 19971130, 1]
hr_HR: setting to [7, 19971130, 1]
hy_AM: setting to [7, 19971130, 1]
id_ID: setting to [7, 19971130, 1]
ig_NG: setting to [7, 19971130, 1]
ik_CA: setting to [7, 19971130, 1]
is_IS: setting to [7, 19971130, 4]
it_CH: setting to [7, 19971130, 4]
it_IT: setting to [7, 19971130, 4]
it_IT@euro: setting to [7, 19971130, 4]
iu_CA: setting to [7, 19971130, 1]
ja_JP: setting to [7, 19971130, 1]
ka_GE: setting to [7, 19971130, 1]
kk_KZ: setting to [7, 19971130, 1]
kl_GL: setting to [7, 19971130, 1]
km_KH: setting to [7, 19971130, 1]
kn_IN: setting to [7, 19971130, 1]
kok_IN: setting to [7, 19971130, 1]
ko_KR: setting to [7, 19971130, 1]
ks_IN: setting to [7, 19971130, 1]
ks_IN@devanagari: setting to [7, 19971130, 1]
ku_TR: setting to [7, 19971130, 1]
kw_GB: setting to [7, 19971130, 4]
ky_KG: setting to [7, 19971130, 1]
lg_UG: setting to [7, 19971130, 1]
lij_IT: setting to [7, 19971130, 4]
lo_LA: setting to [7, 19971130, 1]
lt_LT: setting to [7, 19971130, 4]
lv_LV: setting to [7, 19971130, 1]
lzh_TW: setting to [7, 19971130, 1]
mag_IN: setting to [7, 19971130, 1]
mai_IN: setting to [7, 19971130, 1]
mg_MG: setting to [7, 19971130, 1]
mhr_RU: setting to [7, 19971130, 1]
mi_NZ: setting to [7, 19971130, 1]
ml_IN: setting to [7, 19971130, 1]
mni_IN: setting to [7, 19971130, 1]
mn_MN: setting to [7, 19971130, 1]
mr_IN: setting to [7, 19971130, 1]
ms_MY: setting to [7, 19971130, 1]
mt_MT: setting to [7, 19971130, 1]
my_MM: setting to [7, 19971130, 1]
nan_TW: setting to [7, 19971130, 1]
nan_TW@latin: setting to [7, 19971130, 1]
ne_NP: setting to [7, 19971130, 1]
nhn_MX: setting to [7, 19971130, 1]
niu_NU: setting to [7, 19971130, 1]
niu_NZ: setting to [7, 19971130, 1]
nl_BE: setting to [7, 19971130, 4]
nl_BE@euro: setting to [7, 19971130, 4]
nr_ZA: setting to [7, 19971130, 1]
nso_ZA: setting to [7, 19971130, 1]
oc_FR: setting to [7, 19971130, 4]
om_ET: setting to [7, 19971130, 1]
om_KE: setting to [7, 19971130, 1]
or_IN: setting to [7, 19971130, 1]
os_RU: setting to [7, 19971130, 1]
pa_IN: setting to [7, 19971130, 1]
pap_AW: setting to [7, 19971130, 1]
pap_CW: setting to [7, 19971130, 1]
pa_PK: setting to [7, 19971130, 1]
ps_AF: setting to [7, 19971130, 1]
pt_BR: setting to [7, 19971130, 1]
pt_PT: setting to [7, 19971130, 4]
pt_PT@euro: setting to [7, 19971130, 4]
raj_IN: setting to [7, 19971130, 1]
ro_RO: setting to [7, 19971130, 1]
ru_RU: setting to [7, 19971130, 1]
ru_UA: setting to [7, 19971130, 1]
rw_RW: setting to [7, 19971130, 1]
sa_IN: setting to [7, 19971130, 1]
sat_IN: setting to [7, 19971130, 1]
sd_IN: setting to [7, 19971130, 1]
sd_IN@devanagari: setting to [7, 19971130, 1]
se_NO: setting to [7, 19971130, 4]
shs_CA: setting to [7, 19971130, 1]
sid_ET: setting to [7, 19971130, 1]
si_LK: setting to [7, 19971130, 1]
sl_SI: setting to [7, 19971130, 1]
so_DJ: setting to [7, 19971130, 1]
so_ET: setting to [7, 19971130, 1]
so_KE: setting to [7, 19971130, 1]
so_SO: setting to [7, 19971130, 1]
sq_AL: setting to [7, 19971130, 1]
ss_ZA: setting to [7, 19971130, 1]
st_ZA: setting to [7, 19971130, 1]
sv_FI: setting to [7, 19971130, 4]
sv_SE: setting to [7, 19971130, 4]
ta_IN: setting to [7, 19971130, 1]
tcy_IN: setting to [7, 19971130, 1]
te_IN: setting to [7, 19971130, 1]
tg_TJ: setting to [7, 19971130, 1]
the_NP: setting to [7, 19971130, 1]
th_TH: setting to [7, 19971130, 1]
ti_ER: setting to [7, 19971130, 1]
ti_ET: setting to [7, 19971130, 1]
tig_ER: setting to [7, 19971130, 1]
tk_TM: setting to [7, 19971130, 1]
tl_PH: setting to [7, 19971130, 1]
tn_ZA: setting to [7, 19971130, 1]
tr_CY: setting to [7, 19971130, 1]
tr_TR: setting to [7, 19971130, 1]
ts_ZA: setting to [7, 19971130, 1]
tt_RU: setting to [7, 19971130, 1]
tt_RU@iqtelif: setting to [7, 19971130, 1]
ug_CN: setting to [7, 19971130, 1]
ur_IN: setting to [7, 19971130, 1]
ur_PK: setting to [7, 19971130, 1]
uz_UZ: setting to [7, 19971130, 1]
uz_UZ@cyrillic: setting to [7, 19971130, 1]
ve_ZA: setting to [7, 19971130, 1]
vi_VN: setting to [7, 19971130, 1]
wa_BE: setting to [7, 19971130, 4]
wal_ET: setting to [7, 19971130, 1]
wo_SN: setting to [7, 19971130, 1]
xh_ZA: setting to [7, 19971130, 1]
yi_US: setting to [7, 19971130, 1]
yo_NG: setting to [7, 19971130, 1]
yue_HK: setting to [7, 19971130, 1]
zh_CN: setting to [7, 19971130, 1]
zh_HK: setting to [7, 19971130, 1]
zh_SG: setting to [7, 19971130, 1]
zh_TW: setting to [7, 19971130, 1]
zu_ZA: setting to [7, 19971130, 1]
Finally, set first_weekday in all the locales that were omitting it
and wanted something other than the default of 1.
aa_DJ: setting to 7
aa_ER: setting to 2
aa_ER@saaho: setting to 2
ar_AE: setting to 7
ar_BH: setting to 7
ar_DZ: setting to 7
ar_EG: setting to 7
ar_IQ: setting to 7
ar_JO: setting to 7
ar_KW: setting to 7
ar_LB: setting to 2
ar_LY: setting to 7
ar_MA: setting to 7
ar_OM: setting to 7
ar_QA: setting to 7
ar_SD: setting to 7
ar_SS: setting to 2
ar_SY: setting to 7
az_AZ: setting to 2
be_BY: setting to 2
be_BY@latin: setting to 2
ber_DZ: setting to 7
ber_MA: setting to 7
bn_BD: setting to 6
bs_BA: setting to 2
byn_ER: setting to 2
dv_MV: setting to 6
en_NG: setting to 2
es_BO: setting to 2
es_CL: setting to 2
es_EC: setting to 2
es_UY: setting to 2
fo_FO: setting to 2
fr_CH: setting to 2
gd_GB: setting to 2
gez_ER: setting to 2
ha_NG: setting to 2
hr_HR: setting to 2
hy_AM: setting to 2
ig_NG: setting to 2
is_IS: setting to 2
it_CH: setting to 2
ka_GE: setting to 2
kk_KZ: setting to 2
kl_GL: setting to 2
ku_TR: setting to 2
ky_KG: setting to 2
lg_UG: setting to 2
mg_MG: setting to 2
mn_MN: setting to 2
ms_MY: setting to 2
niu_NU: setting to 2
pap_AW: setting to 2
pap_CW: setting to 2
pt_PT: setting to 2
pt_PT@euro: setting to 2
rw_RW: setting to 2
se_NO: setting to 2
si_LK: setting to 2
so_DJ: setting to 7
so_SO: setting to 2
sq_AL: setting to 2
tg_TJ: setting to 2
ti_ER: setting to 2
tig_ER: setting to 2
tk_TM: setting to 2
tt_RU: setting to 2
tt_RU@iqtelif: setting to 2
uz_UZ: setting to 2
uz_UZ@cyrillic: setting to 2
vi_VN: setting to 2
wo_SN: setting to 2
yo_NG: setting to 2
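To make the week and first_weekday values above easier to read, here is a
small Python sketch. It assumes that first_weekday is counted from the
weekday of the base date in the week field (so 1 means the base date's own
weekday), which is my reading of the locale documentation; the helper name
is hypothetical, and the third week field (minimum days in the first week)
is not modelled.

```python
from datetime import datetime, timedelta

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def first_weekday_name(week, first_weekday):
    """Name the weekday selected by first_weekday for a week triple.

    week is (days per week, base date as YYYYMMDD, min days in first week).
    """
    ndays, base_date, _min_days = week
    base = datetime.strptime(str(base_date), "%Y%m%d").date()
    day = base + timedelta(days=(first_weekday - 1) % ndays)
    return DAY_NAMES[day.weekday()]


# 19971130 was a Sunday, so 1 selects Sunday and 2 selects Monday.
print(first_weekday_name((7, 19971130, 1), 1))
print(first_weekday_name((7, 19971130, 1), 2))
# The old ff_SN base date, 19971129, was a Saturday.
print(first_weekday_name((7, 19971129, 1), 1))
```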
A bunch of locales were copying the wrong source locale -- looks like they
were basically TODOs from the original imports. This led to bad values
for int_prefix for them.
Very few locales set audience/application/abbreviation, and
even the ones that do, set them largely to default/useless
values. Drop them from the few locales until we decide we
want to set these everywhere (to something useful).
This updates a few locales based on CLDR v29 data. I've verified most by
hand while the rest I know are correct.
For int_curr_symbol, it should be 3 characters followed by a space:
ar_SS: changing SDG to SSP
bem_ZM: changing ZMK to ZMW
dz_BT: changing BTN to BTN # Just changing " " to "<U0020>".
en_ZW: changing ZWD to USD
es_SV: changing SVC to USD
lv_LV: changing LVL to EUR
ne_NP: changing INR to NPR
pap_AW: changing ANG to AWG
the_NP: changing INR to NPR
Some of these require updates to iso-4217.def.
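The int_curr_symbol rule is easy to check mechanically. A minimal Python
sketch follows; the function name is hypothetical, and restricting the
three characters to uppercase ASCII letters is an assumption based on the
form of ISO 4217 codes.

```python
import re

# Three uppercase letters followed by exactly one space.
INT_CURR_SYMBOL = re.compile(r"[A-Z]{3} ")


def is_valid_int_curr_symbol(value):
    """Return True if value looks like a currency code plus a space."""
    return INT_CURR_SYMBOL.fullmatch(value) is not None


print(is_valid_int_curr_symbol("SSP "))   # well-formed
print(is_valid_int_curr_symbol("SSP"))    # missing the trailing space
```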
For currency_symbol, it should be the standard/localized symbol name:
aa_DJ: changing $ to Fdj
ar_SA: changing ريال to ر.س
ar_SS: changing ج.س. to £
az_AZ: changing man. to ₼
bg_BG: changing лв to лв.
ce_RU: changing руб to ₽
crh_UA: changing gr to ₴
cv_RU: changing t to ₽
de_CH: changing Fr. to CHF
dz_BT: changing དངུལ་ཀྲམ་ to Nu.
en_BW: changing Pu to P
en_DK: changing ¤ to kr.
en_PH: changing Php to ₱
en_ZW: changing Z$ to $
es_BO: changing $b to Bs
es_DO: changing $ to RD$
es_HN: changing L. to L
es_PA: changing B/ to B/.
es_SV: changing ₡ to $
fil_PH: changing PhP to ₱
he_IL: changing שח to ₪
hy_AM: changing Դ to ֏
ka_GE: changing ლ to ₾
kk_KZ: changing тг to ₸
ko_KR: changing ₩ to ₩
lg_UG: changing /- to USh
lv_LV: changing Ls to €
mg_MG: changing AR to Ar
mhr_RU: changing ТЕҤ to ₽
my_MM: changing Ks to K
os_RU: changing сом to ₽
pap_AW: changing f to ƒ
pap_CW: changing f to ƒ
ps_AF: changing افغانۍ to ؋
rw_RW: changing Frw to FRw
ru_RU: changing руб to ₽
ru_UA: changing гр to ₴
sd_IN@devanagari: changing रु to ₹
se_NO: changing ru to kr
si_LK: changing ₨ to රු
so_SO: changing $ to S
sq_AL: changing Lek to L
ti_ER: changing $ to Nfk
ti_ET: changing $ to Br
tl_PH: changing PhP to ₱
tr_TR: changing TL to ₺
tt_RU: changing руб to ₽
tt_RU@iqtelif: changing sum to ₽
uz_UZ: changing so'm to soʻm
Note: Some of the characters might not render as they're still quite new
in the Unicode database.
Currently localedef accepts any value for the category keyword. This has
allowed bad values to propagate to the vast majority of locales (~90%).
Add some logic to only accept a few standards.
The ISO 30112 standard defines the valid values for the category
keyword as only a few options:
posix:1993
i18n:2004
i18n:2012
The vast majority of locales had changed the "i18n" string to the
name of its own locale (e.g. "ak_GH:2013") as well as tweaking the
date (presumably thinking it should be the date of submission).
Convert all of them to "i18n:2012" for consistency. A follow up
change will update localedef to actually check/validate the field.
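The intended check amounts to a simple whitelist. A rough Python sketch of
the idea follows; localedef itself is written in C, and the names here are
hypothetical.

```python
# Values listed above as the only ones the standard allows.
VALID_CATEGORIES = frozenset({"posix:1993", "i18n:2004", "i18n:2012"})


def is_valid_category(value):
    """Return True if value is one of the accepted category strings."""
    return value in VALID_CATEGORIES


print(is_valid_category("i18n:2012"))    # accepted
print(is_valid_category("ak_GH:2013"))   # rejected: locale name, not a standard
```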
This updates a bunch of locales based on CLDR v29 data:
bg_BG: changing Bulgaria to България
bo_CN: changing ཀྲུང་ཧྭ་མི་དམངས་སྤྱི་མཐུན་རྒྱལ་ཁབ། to རྒྱ་ནག
bo_IN: changing རྒྱ་གར to རྒྱ་གར་
cy_GB: changing Cymru to Y Deyrnas Unedig
dz_BT: changing འབྲུག། to འབྲུག
en_US: changing USA to United States
es_US: changing USA to Estados Unidos
gd_GB: changing Breatainn Mhòr to An Rìoghachd Aonaichte
ha_NG: changing Nigeria to Najeriya
mk_MK: changing Macedonia to Македонија
mn_MN: changing Mongolia to Монгол
sq_MK: changing Macedonia to Maqedoni
sr_RS@latin: changing Srbija i Crna Gora to Srbija
tr_CY: changing Northern Cyprus to Kıbrıs
tr_TR: changing Turkey to Türkiye
ug_CN: changing 中华人民共和国 to جۇڭگو
uz_UZ: changing O'zbekistan to Oʻzbekiston
vi_VN: changing Việt nam to Việt Nam
wae_CH: changing Switzerland to Schwiz
yi_US: changing די פֿאראײניקטע שטאַטן to פֿאַראייניגטע שטאַטן
yo_NG: changing Nigeria to Orílẹ́ède Nàìjíríà
yue_HK: changing 香港 to 中華人民共和國香港特別行政區
zu_ZA: changing Mzansi Afrika to i-South Africa
These all look largely straightforward. Many had English translations
instead of native, and a few have been updated. I can't verify some of
them as I'm not personally familiar, but the CLDR data matches.
The USA->United States seems a little odd, but that is also what the
CLDR database uses everywhere (rather than "United States of America").
We can also fill in a country name where there wasn't one before.
Many look correct to me (mostly the English ones), but there are also
many that I cannot judge. But it can't be worse than leaving it
blank? :)
ar_AE: changing to الإمارات العربية المتحدة
ar_BH: changing to البحرين
ar_DZ: changing to الجزائر
ar_EG: changing to مصر
ar_IN: changing to الهند
ar_IQ: changing to العراق
ar_JO: changing to الأردن
ar_KW: changing to الكويت
ar_LB: changing to لبنان
ar_LY: changing to ليبيا
ar_MA: changing to المغرب
ar_OM: changing to عُمان
ar_QA: changing to قطر
ar_SA: changing to المملكة العربية السعودية
ar_SD: changing to السودان
ar_SS: changing to جنوب السودان
ar_SY: changing to سوريا
ar_TN: changing to تونس
ar_YE: changing to اليمن
as_IN: changing to ভাৰত
ast_ES: changing to España
az_AZ: changing to Azərbaycan
be_BY: changing to Беларусь
bn_IN: changing to ভারত
br_FR: changing to Frañs
brx_IN: changing to भारत
bs_BA: changing to Bosna i Hercegovina
ca_AD: changing to Andorra
ca_ES: changing to Espanya
ca_FR: changing to França
ca_IT: changing to Itàlia
ce_RU: changing to Росси
da_DK: changing to Danmark
de_AT: changing to Österreich
de_BE: changing to Belgien
de_CH: changing to Schweiz
de_LU: changing to Luxemburg
el_CY: changing to Κύπρος
el_GR: changing to Ελλάδα
en_AG: changing to Antigua & Barbuda
en_AU: changing to Australia
en_BW: changing to Botswana
en_CA: changing to Canada
en_DK: changing to Denmark
en_GB: changing to United Kingdom
en_HK: changing to Hong Kong SAR China
en_IE: changing to Ireland
en_IN: changing to India
en_NZ: changing to New Zealand
en_PH: changing to Philippines
en_SG: changing to Singapore
en_ZW: changing to Zimbabwe
es_AR: changing to Argentina
es_BO: changing to Bolivia
es_CL: changing to Chile
es_CO: changing to Colombia
es_CU: changing to Cuba
es_DO: changing to República Dominicana
es_EC: changing to Ecuador
es_ES: changing to España
es_GT: changing to Guatemala
es_HN: changing to Honduras
es_MX: changing to México
es_NI: changing to Nicaragua
es_PA: changing to Panamá
es_PE: changing to Perú
es_PR: changing to Puerto Rico
es_PY: changing to Paraguay
es_SV: changing to El Salvador
es_UY: changing to Uruguay
es_VE: changing to Venezuela
eu_ES: changing to Espainia
fil_PH: changing to Pilipinas
fo_FO: changing to Føroyar
fr_BE: changing to Belgique
fr_CA: changing to Canada
fr_CH: changing to Suisse
fr_FR: changing to France
fr_LU: changing to Luxembourg
fur_IT: changing to Italie
fy_DE: changing to Dútslân
fy_NL: changing to Nederlân
ga_IE: changing to Éire
gl_ES: changing to España
gu_IN: changing to ભારત
gv_GB: changing to Rywvaneth Unys
he_IL: changing to ישראל
hi_IN: changing to भारत
hr_HR: changing to Hrvatska
hu_HU: changing to Magyarország
id_ID: changing to Indonesia
is_IS: changing to Ísland
it_CH: changing to Svizzera
it_IT: changing to Italia
ja_JP: changing to 日本
ka_GE: changing to საქართველო
kk_KZ: changing to Қазақстан
kl_GL: changing to Kalaallit Nunaat
kn_IN: changing to ಭಾರತ
kok_IN: changing to भारत
ko_KR: changing to 대한민국
ks_IN: changing to ہِنٛدوستان
ks_IN@devanagari: changing to भारत
kw_GB: changing to Rywvaneth Unys
ky_KG: changing to Кыргызстан
lt_LT: changing to Lietuva
lv_LV: changing to Latvija
mg_MG: changing to Madagasikara
ml_IN: changing to ഇന്ത്യ
mr_IN: changing to भारत
ms_MY: changing to Malaysia
mt_MT: changing to Malta
nb_NO: changing to Norge
ne_NP: changing to नेपाल
nl_AW: changing to Aruba
nl_BE: changing to België
nl_NL: changing to Nederland
nn_NO: changing to Noreg
or_IN: changing to ଭାରତ
os_RU: changing to Уӕрӕсе
pa_IN: changing to ਭਾਰਤ
pa_PK: changing to ਪਾਕਿਸਤਾਨ
pl_PL: changing to Polska
pt_BR: changing to Brasil
pt_PT: changing to Portugal
ru_RU: changing to Россия
ru_UA: changing to Украина
sd_IN@devanagari: changing to भारत
se_NO: changing to Norga
si_LK: changing to ශ්රී ලංකාව
sk_SK: changing to Slovensko
sl_SI: changing to Slovenija
sq_AL: changing to Shqipëri
sv_SE: changing to Sverige
ta_IN: changing to இந்தியா
ta_LK: changing to இலங்கை
ur_IN: changing to بھارت
ur_PK: changing to پاکستان
These entries have been checked mostly against Wikipedia, but also using
the sources it cites (like the UN and other treaty sources).
Fix incorrect values:
en_BW: changing RB to BW
kl_GL: changing GRO to KN
km_KH: changing LAO to KH
my_MM: changing BA to MYA
oc_FR: changing F to F
tr_CY: changing TR to CY
wae_CH: changing DH to CH
Add missing entries:
aa_DJ: changing to DJI
ak_GH: changing to GH
ar_OM: changing to OM
ar_SS: changing to SUD
ar_YE: changing to YAR
bo_CN: changing to CHN
cmn_TW: changing to RC
dv_MV: changing to MV
dz_BT: changing to BHT
en_AG: changing to AG
es_HN: changing to HN
es_PR: changing to PR
hak_TW: changing to RC
lzh_TW: changing to RC
nan_TW: changing to RC
nan_TW@latin: changing to RC
nl_AW: changing to AUA
pap_AW: changing to AUA
so_DJ: changing to DJI
the_NP: changing to NEP
ug_CN: changing to CHN
yue_HK: changing to HK
zh_CN: changing to CHN
zh_HK: changing to HK
zh_TW: changing to RC
This updates a few locales based on CLDR v29 data.
Add missing fields:
as_IN: changing to 356
dv_MV: changing to 462
kk_KZ: changing to 398
my_MM: changing to 104
rw_RW: changing to 646
tt_RU: changing to 643
Update ones that are wrong:
dz_BT: changing BHU to 064
en_PH: changing 360 to 608
km_KH: changing 418 to 116
ky_KG: changing 643 to 417
tr_CY: changing 792 to 196
wo_SN: changing 450 to 686
As a result of fixing these, I had to update country_ab[23]:
dz_BT: changing BHU to BTN
en_PH: changing ID/IDN to PH/PHL
km_KH: changing LA/LAO to KH/KHM
ky_KG: changing KY/KYR to KG/KGZ
tr_CY: changing TR/TUR to CY/CYP
wo_SN: changing MG/MDG to SN/SEN
Pad with leading zeros to match the standard and other locales:
ber_DZ: changing 12 to 012
ca_AD: changing 20 to 020
en_AG: changing 28 to 028
hy_AM: changing 51 to 051
li_BE: changing 56 to 056
wa_BE: changing 56 to 056
I hand-checked the first two sets directly against ISO 3166-1.
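
The checks described above could be automated along these lines. This is
only a minimal sketch, not the script used for this change: the helper
names (territory_of, mismatched, pad_country_num) and the UPDATED table
are hypothetical, and the values are copied from the lists above. It
assumes the territory part of a locale name should equal its country_ab2
value, which holds for the entries fixed here.

```python
# Values as they stand after this change, copied from the lists above:
# locale -> (country_ab2, country_ab3, country_num).
# For dz_BT the list only gives the alpha-3 change; "BT" is implied.
UPDATED = {
    "dz_BT": ("BT", "BTN", "064"),
    "en_PH": ("PH", "PHL", "608"),
    "km_KH": ("KH", "KHM", "116"),
    "ky_KG": ("KG", "KGZ", "417"),
    "tr_CY": ("CY", "CYP", "196"),
    "wo_SN": ("SN", "SEN", "686"),
}


def territory_of(locale_name):
    """Return the territory part of a name like 'tr_CY' or 'aa_ER@saaho'."""
    base = locale_name.split("@", 1)[0]  # drop any @modifier
    return base.split("_", 1)[1]         # keep what follows the language


def mismatched(entries):
    """List locales whose country_ab2 differs from their territory part."""
    return [
        name
        for name, (ab2, _ab3, _num) in entries.items()
        if territory_of(name) != ab2
    ]


def pad_country_num(value):
    """Left-pad a numeric country code to three digits."""
    return value.zfill(3)


if __name__ == "__main__":
    print(mismatched(UPDATED))
    print(pad_country_num("12"))
```

Feeding the old en_PH values (ID/IDN/360) through mismatched() flags that
locale, which is the kind of error the second list fixes.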
There are only two page sizes that locales use: US-Letter and A4.
For the former, move to copying the en_US locale, while for the
latter, move to copying the i18n locale. This lets us clean up
all the stray comments like FIXME.
There should be no functional differences here.
There are only two measurement systems that locales use: US and metric.
For the former, move to copying the en_US locale, while for the latter,
move to copying the i18n locale. This lets us clean up all the stray
comments like FIXME.
There should be no functional differences here.
This updates all the territory fields based on CLDR v29 data. Many of
them were obviously incorrect: people had used a two-letter code
instead of the English name.
aa_DJ: changing DJ to Djibouti
aa_ER@saaho: changing ER to Eritrea
aa_ER: changing ER to Eritrea
aa_ET: changing ET to Ethiopia
am_ET: changing ET to Ethiopia
ar_LY: changing Libyan Arab Jamahiriya to Libya
ar_SY: changing Syrian Arab Republic to Syria
bo_CN: changing P.R. of China to China
bs_BA: changing Bosnia and Herzegowina to Bosnia & Herzegovina
byn_ER: changing ER to Eritrea
ca_IT: changing Italy (L'Alguer) to Italy
ce_RU: changing RUSSIAN FEDERATION to Russia
cmn_TW: changing Republic of China to Taiwan
cy_GB: changing Great Britain to United Kingdom
de_LU@euro: changing Luxemburg to Luxembourg
de_LU: changing Luxemburg to Luxembourg
en_AG: changing Antigua and Barbuda to Antigua & Barbuda
en_GB: changing Great Britain to United Kingdom
en_HK: changing Hong Kong to Hong Kong SAR China
en_US: changing USA to United States
es_US: changing USA to United States
fr_LU@euro: changing Luxemburg to Luxembourg
fr_LU: changing Luxemburg to Luxembourg
fy_DE: changing DE to Germany
gd_GB: changing Great Britain to United Kingdom
gez_ER@abegede: changing ER to Eritrea
gez_ER: changing ER to Eritrea
gez_ET@abegede: changing ET to Ethiopia
gez_ET: changing ET to Ethiopia
gv_GB: changing Britain to United Kingdom
hak_TW: changing Republic of China to Taiwan
iu_CA: changing CA to Canada
ko_KR: changing Republic of Korea to South Korea
kw_GB: changing Britain to United Kingdom
li_BE: changing BE to Belgium
li_NL: changing NL to Netherlands
lzh_TW: changing Republic of China to Taiwan
my_MM: changing Myanmar to Myanmar (Burma)
nan_TW: changing Republic of China to Taiwan
nds_DE: changing DE to Germany
nds_NL: changing NL to Netherlands
om_ET: changing ET to Ethiopia
om_KE: changing KE to Kenya
pap_AW: changing AW to Aruba
pap_CW: changing CW to Curaçao
pt_BR: changing Brasil to Brazil
sid_ET: changing ET to Ethiopia
sk_SK: changing Slovak to Slovakia
so_DJ: changing DJ to Djibouti
so_ET: changing ET to Ethiopia
so_KE: changing KE to Kenya
so_SO: changing SO to Somalia
ti_ER: changing ER to Eritrea
ti_ET: changing ET to Ethiopia
tig_ER: changing ER to Eritrea
tt_RU@iqtelif: changing Tatarstan, Russian Federation to Russia
uk_UA: changing UA to Ukraine
unm_US: changing USA to United States
wal_ET: changing ET to Ethiopia
yi_US: changing USA to United States
yue_HK: changing Hong Kong to Hong Kong SAR China
zh_CN: changing P.R. of China to China
zh_HK: changing Hong Kong to Hong Kong SAR China
zh_TW: changing Taiwan R.O.C. to Taiwan
This updates all the language fields based on CLDR v29 data. Many of
them were obviously incorrect: people had used the language code
instead of the English name.
aa_DJ: changing aa to Afar
aa_ER: changing aa to Afar
aa_ER@saaho: changing aa to Afar
aa_ET: changing aa to Afar
am_ET: changing am to Amharic
az_AZ: changing Azeri to Azerbaijani
bn_BD: changing Bengali/Bangla to Bengali
byn_ER: changing byn to Blin
de_AT: changing German to Austrian German
de_CH: changing German to Swiss High German
en_AU: changing English to Australian English
en_CA: changing English to Canadian English
en_GB: changing English to British English
en_US: changing English to American English
es_ES: changing Spanish to European Spanish
es_MX: changing Spanish to Mexican Spanish
ff_SN: changing ff to Fulah
fr_CA: changing French to Canadian French
fr_CH: changing French to Swiss French
fur_IT: changing Furlan to Friulian
fy_DE: changing fy to Western Frisian
fy_NL: changing Frisian to Western Frisian
gd_GB: changing Scots Gaelic to Scottish Gaelic
gez_ER@abegede: changing gez to Geez
gez_ER: changing gez to Geez
gez_ET@abegede: changing gez to Geez
gez_ET: changing gez to Geez
gv_GB: changing Manx Gaelic to Manx
ht_HT: changing Kreyol to Haitian Creole
kl_GL: changing Greenlandic to Kalaallisut
lg_UG: changing Luganda to Ganda
li_BE: changing li to Limburgish
li_NL: changing li to Limburgish
nan_TW@latin: changing Minnan to Min Nan Chinese
nb_NO: changing Norwegian, Bokmål to Norwegian Bokmål
nds_DE: changing nds to Low German
nds_NL: changing nds to Low Saxon
niu_NU: changing Vagahau Niue (Niuean) to Niuean
niu_NZ: changing Vagahau Niue (Niuean) to Niuean
nl_BE: changing Dutch to Flemish
nn_NO: changing Norwegian, Nynorsk to Norwegian Nynorsk
nr_ZA: changing Southern Ndebele to South Ndebele
om_ET: changing om to Oromo
om_KE: changing om to Oromo
or_IN: changing Odia to Oriya
os_RU: changing Ossetian to Ossetic
pap_AW: changing pap to Papiamento
pap_CW: changing pap to Papiamento
pa_PK: changing Punjabi (Shahmukhi) to Punjabi
pt_BR: changing Portuguese to Brazilian Portuguese
pt_PT: changing Portuguese to European Portuguese
se_NO: changing Northern Saami to Northern Sami
sid_ET: changing sid to Sidamo
so_DJ: changing so to Somali
so_ET: changing so to Somali
so_KE: changing so to Somali
so_SO: changing so to Somali
st_ZA: changing Sotho to Southern Sotho
sw_KE: changing sw to Swahili
sw_TZ: changing sw to Swahili
ti_ER: changing ti to Tigrinya
ti_ET: changing ti to Tigrinya
tig_ER: changing tig to Tigre
uk_UA: changing uk to Ukrainian
wal_ET: changing wal to Wolaytta
yue_HK: changing Yue Chinese to Cantonese
There's no real value in populating this field when it's the same as the
default POSIX setting, so drop it from most locales so it's clear what's
going on.
These locales should be using A4 paper size rather than US-Letter.
Update the copy points to match the others in the file. All other
locales have been verified against CLDR and by hand.
From the bug:
Obsolete locale. The ISO-639 code for Hebrew was changed from 'iw'
to 'he' in 1989, according to Bruno Haible on libc-alpha 2003-09-01.
Reported-by: Chris Leonard <cjlhomeaddress@gmail.com>