A handful of regexes were already allowing +1 for yesexpr and -0 for
noexpr, which is what the i18n definition does. Standardize all locales
by allowing these language-independent values in them.
Example change for en_US goes from ^[yY] to ^[+1yY], and from ^[nN]
to ^[-0nN].
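As a rough illustration of the en_US change, the sketch below applies the
updated expressions to a few answers. It uses Python's re module as a
stand-in for the POSIX regex matching that consumers of yesexpr/noexpr
perform, and the helper name classify is hypothetical.

```python
import re

# Updated en_US expressions from this change. The leading "-" inside the
# noexpr bracket expression is a literal character, not a range.
YESEXPR = re.compile(r"^[+1yY]")
NOEXPR = re.compile(r"^[-0nN]")


def classify(answer):
    """Return "yes", "no", or None for an answer (hypothetical helper)."""
    if YESEXPR.match(answer):
        return "yes"
    if NOEXPR.match(answer):
        return "no"
    return None
```

With these expressions, "+" and "1" are accepted as affirmative and "-" and
"0" as negative, in addition to the language-specific letters.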
I've spot-checked a number of these, including some that were definitely
wrong (like ff_SN). This change also fixes all open week-related bugs.
Since ff_SN is the only one that changes its base date, I also made
sure that its ordering of day translations was correct. This looks like
another case, which Petr brought up, where the week field was not
actually checked against the day arrays.
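The base date can be sanity-checked with a few lines of code. This is a
minimal sketch, assuming the second element of the week field is a
Gregorian date written as YYYYMMDD; the function name base_weekday is
hypothetical and not part of localedef.

```python
from datetime import date

# date.weekday() returns 0 for Monday through 6 for Sunday.
_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
          "Friday", "Saturday", "Sunday"]


def base_weekday(yyyymmdd):
    """Return the English weekday name of a week base date like 19971130."""
    year, rest = divmod(yyyymmdd, 10000)
    month, day = divmod(rest, 100)
    return _NAMES[date(year, month, day).weekday()]
```

Under that assumption, 19971130 falls on a Sunday and the old ff_SN value
19971129 on a Saturday, which is why the ordering of the day arrays has to
be checked whenever the base date changes.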
I also took the opportunity to drop first_weekday/first_workday when
the value aligned with the defaults (1 & 2 respectively). This didn't
impact too many locales in practice because the majority omitted them
already.
A few locales were defining some values incorrectly for their region:
ak_GH: week: changing [7, 19971130, 7] to [7, 19971130, 1]
ak_GH: first_weekday: changing 1 to 2
ayc_PE: week: changing [7, 19971130, 7] to [7, 19971130, 1]
bem_ZM: week: changing [7, 19971130, 4] to [7, 19971130, 1]
bem_ZM: first_weekday: changing 1 to 2
en_IE: first_weekday: changing 2 to 1
en_US: week: changing [7, 19971130, 7] to [7, 19971130, 1]
es_CO: first_weekday: changing 2 to 1
es_ES: week: changing [7, 19971130, 5] to [7, 19971130, 4]
ff_SN: week: changing [7, 19971129, 1] to [7, 19971130, 1]
ff_SN: first_weekday: changing 1 to 2
ga_IE: first_weekday: changing 2 to 1
ht_HT: week: changing [7, 19971130, 7] to [7, 19971130, 1]
ht_HT: first_weekday: changing 1 to 2
mk_MK: week: changing [7, 19971130, 4] to [7, 19971130, 1]
mt_MT: first_weekday: changing 2 to 1
quz_PE: week: changing [7, 19971130, 7] to [7, 19971130, 1]
sr_ME: week: changing [7, 19971130, 4] to [7, 19971130, 1]
sr_RS: week: changing [7, 19971130, 4] to [7, 19971130, 1]
sr_RS@latin: week: changing [7, 19971130, 4] to [7, 19971130, 1]
sw_KE: week: changing [7, 19971130, 4] to [7, 19971130, 1]
sw_KE: first_weekday: changing 2 to 1
uk_UA: week: changing [7, 19971130, 4] to [7, 19971130, 1]
unm_US: week: changing [7, 19971130, 4] to [7, 19971130, 1]
Some locales were copying locales that had the wrong week settings, so
that content had to be duplicated so the values could be adjusted:
el_CY: week: setting to [7, 19971130, 1]
en_AG: week: setting to [7, 19971130, 1]
en_AG: first_weekday: changing 2 to 1
en_ZM: week: setting to [7, 19971130, 1]
es_CU: week: setting to [7, 19971130, 1]
nl_AW: week: setting to [7, 19971130, 1]
sw_TZ: first_weekday: setting to 2
ta_LK: first_weekday: setting to 2
The majority of locales were omitting the week field, thus getting the
default [7, 19971130, 0 (localedef) / 7 (ISO standard)]. Unfortunately,
neither of those is used by any locale, so we end up having to define
the field just to set the ndays field. In practice, this rarely matters
due to its usage, and the first two fields match the defaults.
aa_DJ: setting to [7, 19971130, 1]
aa_ER: setting to [7, 19971130, 1]
aa_ER@saaho: setting to [7, 19971130, 1]
aa_ET: setting to [7, 19971130, 1]
af_ZA: setting to [7, 19971130, 1]
am_ET: setting to [7, 19971130, 1]
an_ES: setting to [7, 19971130, 4]
anp_IN: setting to [7, 19971130, 1]
ar_AE: setting to [7, 19971130, 1]
ar_BH: setting to [7, 19971130, 1]
ar_DZ: setting to [7, 19971130, 1]
ar_EG: setting to [7, 19971130, 1]
ar_IN: setting to [7, 19971130, 1]
ar_IQ: setting to [7, 19971130, 1]
ar_JO: setting to [7, 19971130, 1]
ar_KW: setting to [7, 19971130, 1]
ar_LB: setting to [7, 19971130, 1]
ar_LY: setting to [7, 19971130, 1]
ar_MA: setting to [7, 19971130, 1]
ar_OM: setting to [7, 19971130, 1]
ar_QA: setting to [7, 19971130, 1]
ar_SA: setting to [7, 19971130, 1]
ar_SD: setting to [7, 19971130, 1]
ar_SS: setting to [7, 19971130, 1]
ar_SY: setting to [7, 19971130, 1]
ar_TN: setting to [7, 19971130, 1]
ar_YE: setting to [7, 19971130, 1]
as_IN: setting to [7, 19971130, 1]
ast_ES: setting to [7, 19971130, 4]
az_AZ: setting to [7, 19971130, 1]
be_BY: setting to [7, 19971130, 1]
be_BY@latin: setting to [7, 19971130, 1]
ber_DZ: setting to [7, 19971130, 1]
ber_MA: setting to [7, 19971130, 1]
bg_BG: setting to [7, 19971130, 4]
bhb_IN: setting to [7, 19971130, 1]
bho_IN: setting to [7, 19971130, 1]
bn_BD: setting to [7, 19971130, 1]
bn_IN: setting to [7, 19971130, 1]
bo_CN: setting to [7, 19971130, 1]
br_FR: setting to [7, 19971130, 4]
brx_IN: setting to [7, 19971130, 1]
bs_BA: setting to [7, 19971130, 1]
byn_ER: setting to [7, 19971130, 1]
ca_AD: setting to [7, 19971130, 4]
ca_ES: setting to [7, 19971130, 4]
ca_ES@euro: setting to [7, 19971130, 4]
ca_FR: setting to [7, 19971130, 4]
ca_IT: setting to [7, 19971130, 4]
ce_RU: setting to [7, 19971130, 1]
cmn_TW: setting to [7, 19971130, 1]
crh_UA: setting to [7, 19971130, 1]
cv_RU: setting to [7, 19971130, 1]
cy_GB: setting to [7, 19971130, 4]
de_BE: setting to [7, 19971130, 4]
de_LU: setting to [7, 19971130, 4]
doi_IN: setting to [7, 19971130, 1]
dv_MV: setting to [7, 19971130, 1]
dz_BT: setting to [7, 19971130, 1]
el_GR: setting to [7, 19971130, 4]
el_GR@euro: setting to [7, 19971130, 4]
en_AU: setting to [7, 19971130, 1]
en_BW: setting to [7, 19971130, 1]
en_CA: setting to [7, 19971130, 1]
en_HK: setting to [7, 19971130, 1]
en_IE: setting to [7, 19971130, 4]
en_IN: setting to [7, 19971130, 1]
en_NG: setting to [7, 19971130, 1]
en_NZ: setting to [7, 19971130, 1]
en_PH: setting to [7, 19971130, 1]
en_SG: setting to [7, 19971130, 1]
en_ZA: setting to [7, 19971130, 1]
en_ZW: setting to [7, 19971130, 1]
es_AR: setting to [7, 19971130, 1]
es_BO: setting to [7, 19971130, 1]
es_CL: setting to [7, 19971130, 1]
es_CO: setting to [7, 19971130, 1]
es_CR: setting to [7, 19971130, 1]
es_DO: setting to [7, 19971130, 1]
es_EC: setting to [7, 19971130, 1]
es_ES@euro: setting to [7, 19971130, 4]
es_GT: setting to [7, 19971130, 1]
es_HN: setting to [7, 19971130, 1]
es_MX: setting to [7, 19971130, 1]
es_NI: setting to [7, 19971130, 1]
es_PA: setting to [7, 19971130, 1]
es_PE: setting to [7, 19971130, 1]
es_PR: setting to [7, 19971130, 1]
es_PY: setting to [7, 19971130, 1]
es_SV: setting to [7, 19971130, 1]
es_US: setting to [7, 19971130, 1]
es_UY: setting to [7, 19971130, 1]
es_VE: setting to [7, 19971130, 1]
eu_ES: setting to [7, 19971130, 4]
fa_IR: setting to [7, 19971130, 1]
fil_PH: setting to [7, 19971130, 1]
fo_FO: setting to [7, 19971130, 4]
fr_CA: setting to [7, 19971130, 1]
fr_CH: setting to [7, 19971130, 4]
fr_LU: setting to [7, 19971130, 4]
fy_NL: setting to [7, 19971130, 4]
ga_IE: setting to [7, 19971130, 4]
gd_GB: setting to [7, 19971130, 4]
gez_ER: setting to [7, 19971130, 1]
gez_ET: setting to [7, 19971130, 1]
gl_ES: setting to [7, 19971130, 4]
gu_IN: setting to [7, 19971130, 1]
gv_GB: setting to [7, 19971130, 4]
hak_TW: setting to [7, 19971130, 1]
ha_NG: setting to [7, 19971130, 1]
he_IL: setting to [7, 19971130, 1]
hi_IN: setting to [7, 19971130, 1]
hne_IN: setting to [7, 19971130, 1]
hr_HR: setting to [7, 19971130, 1]
hy_AM: setting to [7, 19971130, 1]
id_ID: setting to [7, 19971130, 1]
ig_NG: setting to [7, 19971130, 1]
ik_CA: setting to [7, 19971130, 1]
is_IS: setting to [7, 19971130, 4]
it_CH: setting to [7, 19971130, 4]
it_IT: setting to [7, 19971130, 4]
it_IT@euro: setting to [7, 19971130, 4]
iu_CA: setting to [7, 19971130, 1]
ja_JP: setting to [7, 19971130, 1]
ka_GE: setting to [7, 19971130, 1]
kk_KZ: setting to [7, 19971130, 1]
kl_GL: setting to [7, 19971130, 1]
km_KH: setting to [7, 19971130, 1]
kn_IN: setting to [7, 19971130, 1]
kok_IN: setting to [7, 19971130, 1]
ko_KR: setting to [7, 19971130, 1]
ks_IN: setting to [7, 19971130, 1]
ks_IN@devanagari: setting to [7, 19971130, 1]
ku_TR: setting to [7, 19971130, 1]
kw_GB: setting to [7, 19971130, 4]
ky_KG: setting to [7, 19971130, 1]
lg_UG: setting to [7, 19971130, 1]
lij_IT: setting to [7, 19971130, 4]
lo_LA: setting to [7, 19971130, 1]
lt_LT: setting to [7, 19971130, 4]
lv_LV: setting to [7, 19971130, 1]
lzh_TW: setting to [7, 19971130, 1]
mag_IN: setting to [7, 19971130, 1]
mai_IN: setting to [7, 19971130, 1]
mg_MG: setting to [7, 19971130, 1]
mhr_RU: setting to [7, 19971130, 1]
mi_NZ: setting to [7, 19971130, 1]
ml_IN: setting to [7, 19971130, 1]
mni_IN: setting to [7, 19971130, 1]
mn_MN: setting to [7, 19971130, 1]
mr_IN: setting to [7, 19971130, 1]
ms_MY: setting to [7, 19971130, 1]
mt_MT: setting to [7, 19971130, 1]
my_MM: setting to [7, 19971130, 1]
nan_TW: setting to [7, 19971130, 1]
nan_TW@latin: setting to [7, 19971130, 1]
ne_NP: setting to [7, 19971130, 1]
nhn_MX: setting to [7, 19971130, 1]
niu_NU: setting to [7, 19971130, 1]
niu_NZ: setting to [7, 19971130, 1]
nl_BE: setting to [7, 19971130, 4]
nl_BE@euro: setting to [7, 19971130, 4]
nr_ZA: setting to [7, 19971130, 1]
nso_ZA: setting to [7, 19971130, 1]
oc_FR: setting to [7, 19971130, 4]
om_ET: setting to [7, 19971130, 1]
om_KE: setting to [7, 19971130, 1]
or_IN: setting to [7, 19971130, 1]
os_RU: setting to [7, 19971130, 1]
pa_IN: setting to [7, 19971130, 1]
pap_AW: setting to [7, 19971130, 1]
pap_CW: setting to [7, 19971130, 1]
pa_PK: setting to [7, 19971130, 1]
ps_AF: setting to [7, 19971130, 1]
pt_BR: setting to [7, 19971130, 1]
pt_PT: setting to [7, 19971130, 4]
pt_PT@euro: setting to [7, 19971130, 4]
raj_IN: setting to [7, 19971130, 1]
ro_RO: setting to [7, 19971130, 1]
ru_RU: setting to [7, 19971130, 1]
ru_UA: setting to [7, 19971130, 1]
rw_RW: setting to [7, 19971130, 1]
sa_IN: setting to [7, 19971130, 1]
sat_IN: setting to [7, 19971130, 1]
sd_IN: setting to [7, 19971130, 1]
sd_IN@devanagari: setting to [7, 19971130, 1]
se_NO: setting to [7, 19971130, 4]
shs_CA: setting to [7, 19971130, 1]
sid_ET: setting to [7, 19971130, 1]
si_LK: setting to [7, 19971130, 1]
sl_SI: setting to [7, 19971130, 1]
so_DJ: setting to [7, 19971130, 1]
so_ET: setting to [7, 19971130, 1]
so_KE: setting to [7, 19971130, 1]
so_SO: setting to [7, 19971130, 1]
sq_AL: setting to [7, 19971130, 1]
ss_ZA: setting to [7, 19971130, 1]
st_ZA: setting to [7, 19971130, 1]
sv_FI: setting to [7, 19971130, 4]
sv_SE: setting to [7, 19971130, 4]
ta_IN: setting to [7, 19971130, 1]
tcy_IN: setting to [7, 19971130, 1]
te_IN: setting to [7, 19971130, 1]
tg_TJ: setting to [7, 19971130, 1]
the_NP: setting to [7, 19971130, 1]
th_TH: setting to [7, 19971130, 1]
ti_ER: setting to [7, 19971130, 1]
ti_ET: setting to [7, 19971130, 1]
tig_ER: setting to [7, 19971130, 1]
tk_TM: setting to [7, 19971130, 1]
tl_PH: setting to [7, 19971130, 1]
tn_ZA: setting to [7, 19971130, 1]
tr_CY: setting to [7, 19971130, 1]
tr_TR: setting to [7, 19971130, 1]
ts_ZA: setting to [7, 19971130, 1]
tt_RU: setting to [7, 19971130, 1]
tt_RU@iqtelif: setting to [7, 19971130, 1]
ug_CN: setting to [7, 19971130, 1]
ur_IN: setting to [7, 19971130, 1]
ur_PK: setting to [7, 19971130, 1]
uz_UZ: setting to [7, 19971130, 1]
uz_UZ@cyrillic: setting to [7, 19971130, 1]
ve_ZA: setting to [7, 19971130, 1]
vi_VN: setting to [7, 19971130, 1]
wa_BE: setting to [7, 19971130, 4]
wal_ET: setting to [7, 19971130, 1]
wo_SN: setting to [7, 19971130, 1]
xh_ZA: setting to [7, 19971130, 1]
yi_US: setting to [7, 19971130, 1]
yo_NG: setting to [7, 19971130, 1]
yue_HK: setting to [7, 19971130, 1]
zh_CN: setting to [7, 19971130, 1]
zh_HK: setting to [7, 19971130, 1]
zh_SG: setting to [7, 19971130, 1]
zh_TW: setting to [7, 19971130, 1]
zu_ZA: setting to [7, 19971130, 1]
Finally, set first_weekday in all the locales that were omitting it
and wanted something other than the default of 1.
aa_DJ: setting to 7
aa_ER: setting to 2
aa_ER@saaho: setting to 2
ar_AE: setting to 7
ar_BH: setting to 7
ar_DZ: setting to 7
ar_EG: setting to 7
ar_IQ: setting to 7
ar_JO: setting to 7
ar_KW: setting to 7
ar_LB: setting to 2
ar_LY: setting to 7
ar_MA: setting to 7
ar_OM: setting to 7
ar_QA: setting to 7
ar_SD: setting to 7
ar_SS: setting to 2
ar_SY: setting to 7
az_AZ: setting to 2
be_BY: setting to 2
be_BY@latin: setting to 2
ber_DZ: setting to 7
ber_MA: setting to 7
bn_BD: setting to 6
bs_BA: setting to 2
byn_ER: setting to 2
dv_MV: setting to 6
en_NG: setting to 2
es_BO: setting to 2
es_CL: setting to 2
es_EC: setting to 2
es_UY: setting to 2
fo_FO: setting to 2
fr_CH: setting to 2
gd_GB: setting to 2
gez_ER: setting to 2
ha_NG: setting to 2
hr_HR: setting to 2
hy_AM: setting to 2
ig_NG: setting to 2
is_IS: setting to 2
it_CH: setting to 2
ka_GE: setting to 2
kk_KZ: setting to 2
kl_GL: setting to 2
ku_TR: setting to 2
ky_KG: setting to 2
lg_UG: setting to 2
mg_MG: setting to 2
mn_MN: setting to 2
ms_MY: setting to 2
niu_NU: setting to 2
pap_AW: setting to 2
pap_CW: setting to 2
pt_PT: setting to 2
pt_PT@euro: setting to 2
rw_RW: setting to 2
se_NO: setting to 2
si_LK: setting to 2
so_DJ: setting to 7
so_SO: setting to 2
sq_AL: setting to 2
tg_TJ: setting to 2
ti_ER: setting to 2
tig_ER: setting to 2
tk_TM: setting to 2
tt_RU: setting to 2
tt_RU@iqtelif: setting to 2
uz_UZ: setting to 2
uz_UZ@cyrillic: setting to 2
vi_VN: setting to 2
wo_SN: setting to 2
yo_NG: setting to 2
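To make the first_weekday values above easier to read, here is a small
sketch of how they can be interpreted. It assumes that first_weekday is a
1-based index into the day list, and that the day list starts on the
weekday of the week base date (Sunday for 19971130). The helper name
first_weekday_name is hypothetical.

```python
# Day list as it would start for a base date of 19971130 (a Sunday).
DAYS_FROM_SUNDAY = ["Sunday", "Monday", "Tuesday", "Wednesday",
                    "Thursday", "Friday", "Saturday"]


def first_weekday_name(first_weekday):
    """Map a 1-based first_weekday value to a day name (assumed semantics)."""
    if not 1 <= first_weekday <= 7:
        raise ValueError("first_weekday must be in the range 1..7")
    return DAYS_FROM_SUNDAY[first_weekday - 1]
```

Under these assumptions, the default of 1 means Sunday, 2 means Monday,
6 means Friday, and 7 means Saturday.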
The ISO 30112 standard defines the valid values for the category
keyword as only a few options:
posix:1993
i18n:2004
i18n:2012
The vast majority of locales had changed the "i18n" string to the
name of their own locale (e.g. "ak_GH:2013") as well as tweaking the
date (presumably thinking it should be the date of submission).
Convert all of them to "i18n:2012" for consistency. A follow-up
change will update localedef to actually check/validate the field.
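The conversion can be sketched as follows. This is only an illustration,
not the script used for the change and not the localedef check mentioned
above: the name normalize_category is hypothetical, and leaving already
valid values untouched is an assumption.

```python
import re

# The values listed above as valid for the category keyword.
VALID_CATEGORIES = {"posix:1993", "i18n:2004", "i18n:2012"}

# Shape of the nonstandard values, e.g. "ak_GH:2013": a name, a colon,
# and a four-digit year.
_NAME_YEAR = re.compile(r"^[^:]+:\d{4}$")


def normalize_category(value):
    """Return a valid category value, rewriting "<locale>:<year>" forms."""
    if value in VALID_CATEGORIES:
        return value  # assumption: valid values are kept as-is
    if _NAME_YEAR.match(value):
        return "i18n:2012"
    raise ValueError("unrecognized category value: %r" % (value,))
```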
These entries have been checked mostly against Wikipedia, but also using
the sources it cites (like the UN and other treaty sources).
Fix incorrect values:
en_BW: changing RB to BW
kl_GL: changing GRO to KN
km_KH: changing LAO to KH
my_MM: changing BA to MYA
oc_FR: changing F to F
tr_CY: changing TR to CY
wae_CH: changing DH to CH
Add missing entries:
aa_DJ: changing to DJI
ak_GH: changing to GH
ar_OM: changing to OM
ar_SS: changing to SUD
ar_YE: changing to YAR
bo_CN: changing to CHN
cmn_TW: changing to RC
dv_MV: changing to MV
dz_BT: changing to BHT
en_AG: changing to AG
es_HN: changing to HN
es_PR: changing to PR
hak_TW: changing to RC
lzh_TW: changing to RC
nan_TW: changing to RC
nan_TW@latin: changing to RC
nl_AW: changing to AUA
pap_AW: changing to AUA
so_DJ: changing to DJI
the_NP: changing to NEP
ug_CN: changing to CHN
yue_HK: changing to HK
zh_CN: changing to CHN
zh_HK: changing to HK
zh_TW: changing to RC
This updates a few locales based on CLDR v29 data.
Add missing fields:
as_IN: changing to 356
dv_MV: changing to 462
kk_KZ: changing to 398
my_MM: changing to 104
rw_RW: changing to 646
tt_RU: changing to 643
Update ones that are wrong:
dz_BT: changing BHU to 064
en_PH: changing 360 to 608
km_KH: changing 418 to 116
ky_KG: changing 643 to 417
tr_CY: changing 792 to 196
wo_SN: changing 450 to 686
As a result of fixing these, I had to update country_ab[23]:
dz_BT: changing BHU to BTN
en_PH: changing ID/IDN to PH/PHL
km_KH: changing LA/LAO to KH/KHM
ky_KG: changing KY/KYR to KG/KGZ
tr_CY: changing TR/TUR to CY/CYP
wo_SN: changing MG/MDG to SN/SEN
Pad with leading zeros to match the standard and other locales:
ber_DZ: changing 12 to 012
ca_AD: changing 20 to 020
en_AG: changing 28 to 028
hy_AM: changing 51 to 051
li_BE: changing 56 to 056
wa_BE: changing 56 to 056
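The padding rule is simple enough to express in a few lines. The sketch
below is illustrative only; the function name pad_country_num is
hypothetical, and it assumes the numeric country code is always written
with three digits.

```python
def pad_country_num(value):
    """Return a numeric country code as a three-digit, zero-padded string."""
    text = str(value).strip()
    if not text.isdigit() or len(text) > 3:
        raise ValueError("not a numeric country code: %r" % (value,))
    return text.zfill(3)
```

For example, the ber_DZ value 12 becomes 012, while a value that already
has three digits, such as 643, is returned unchanged.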
I hand-checked the first two sets against ISO 3166-1 directly.
There are only two page sizes that locales use: US-Letter and A4.
For the former, move to copying the en_US locale, while for the
latter, move to copying the i18n locale. This lets us clean up
all the stray comments like FIXME.
There should be no functional differences here.
There are only two measurement systems that locales use: US and metric.
For the former, move to copying the en_US locale, while for the latter,
move to copying the i18n locale. This lets us clean up all the stray
comments like FIXME.
There should be no functional differences here.