These comments are useless and only confusing. The encodings used to
create binary locales from source locales are listed in the
localedata/SUPPORTED file. The source files itself are ASCII or UTF-8
encoded where non-ASCII UTF-8 is currently only used in comments. If
all locale source files are UTF-8 anyway, there is no need to specify
that in a special comment.
A handful of regexes were allowing +1 for yesexpr and -0 for noexpr,
and it's the i18n definition. Standardize all locales by allowing
these language-independent values in them.
Example change for en_US goes from ^[yY] to ^[+1yY], and from ^[nN]
to ^[-0nN].
I've spot checked a number of these, including some that were def
wrong (like ff_SN). It also fixes all open week-related bugs.
Since ff_SN is the only one that changes its base date, I also made
sure that its ordering of day translations were correct. Looks like
another case Petr brought up where the week field was not actually
checked against the day arrays.
I also took the opportunity to drop first_weekday/first_workday when
the value aligned with the defaults (1 & 2 respectively). This didn't
impact too many locales In practice because the majority omitted them
already.
A few locales were defining some values incorrectly for their region:
ak_GH: week: changing [7, 19971130, 7] to [7, 19971130, 1]
ak_GH: first_weekday: changing 1 to 2
ayc_PE: week: changing [7, 19971130, 7] to [7, 19971130, 1]
bem_ZM: week: changing [7, 19971130, 4] to [7, 19971130, 1]
bem_ZM: first_weekday: changing 1 to 2
en_IE: first_weekday: changing 2 to 1
en_US: week: changing [7, 19971130, 7] to [7, 19971130, 1]
es_CO: first_weekday: changing 2 to 1
es_ES: week: changing [7, 19971130, 5] to [7, 19971130, 4]
ff_SN: week: changing [7, 19971129, 1] to [7, 19971130, 1]
ff_SN: first_weekday: changing 1 to 2
ga_IE: first_weekday: changing 2 to 1
ht_HT: week: changing [7, 19971130, 7] to [7, 19971130, 1]
ht_HT: first_weekday: changing 1 to 2
mk_MK: week: changing [7, 19971130, 4] to [7, 19971130, 1]
mt_MT: first_weekday: changing 2 to 1
quz_PE: week: changing [7, 19971130, 7] to [7, 19971130, 1]
sr_ME: week: changing [7, 19971130, 4] to [7, 19971130, 1]
sr_RS: week: changing [7, 19971130, 4] to [7, 19971130, 1]
sr_RS@latin: week: changing [7, 19971130, 4] to [7, 19971130, 1]
sw_KE: week: changing [7, 19971130, 4] to [7, 19971130, 1]
sw_KE: first_weekday: changing 2 to 1
uk_UA: week: changing [7, 19971130, 4] to [7, 19971130, 1]
unm_US: week: changing [7, 19971130, 4] to [7, 19971130, 1]
Some locales were copying locales that had the wrong week settings, so
that content had to be duplicated so the values could be adjusted:
el_CY: week: setting to [7, 19971130, 1]
en_AG: week: setting to [7, 19971130, 1]
en_AG: first_weekday: changing 2 to 1
en_ZM: week: setting to [7, 19971130, 1]
es_CU: week: setting to [7, 19971130, 1]
nl_AW: week: setting to [7, 19971130, 1]
sw_TZ: first_weekday: setting to 2
ta_LK: first_weekday: setting to 2
The majority of locales were omitting the week field thus getting the
default [7, 19971130, 0 (localedef) / 7 (ISO standard)]. Unfortunately,
neither of those are used by any locales, so we end up having to define
the field just to se the ndays field. In practice, this rarely matters
due to it usage, and the first two fields match the defaults.
aa_DJ: setting to [7, 19971130, 1]
aa_ER: setting to [7, 19971130, 1]
aa_ER@saaho: setting to [7, 19971130, 1]
aa_ET: setting to [7, 19971130, 1]
af_ZA: setting to [7, 19971130, 1]
am_ET: setting to [7, 19971130, 1]
an_ES: setting to [7, 19971130, 4]
anp_IN: setting to [7, 19971130, 1]
ar_AE: setting to [7, 19971130, 1]
ar_BH: setting to [7, 19971130, 1]
ar_DZ: setting to [7, 19971130, 1]
ar_EG: setting to [7, 19971130, 1]
ar_IN: setting to [7, 19971130, 1]
ar_IQ: setting to [7, 19971130, 1]
ar_JO: setting to [7, 19971130, 1]
ar_KW: setting to [7, 19971130, 1]
ar_LB: setting to [7, 19971130, 1]
ar_LY: setting to [7, 19971130, 1]
ar_MA: setting to [7, 19971130, 1]
ar_OM: setting to [7, 19971130, 1]
ar_QA: setting to [7, 19971130, 1]
ar_SA: setting to [7, 19971130, 1]
ar_SD: setting to [7, 19971130, 1]
ar_SS: setting to [7, 19971130, 1]
ar_SY: setting to [7, 19971130, 1]
ar_TN: setting to [7, 19971130, 1]
ar_YE: setting to [7, 19971130, 1]
as_IN: setting to [7, 19971130, 1]
ast_ES: setting to [7, 19971130, 4]
az_AZ: setting to [7, 19971130, 1]
be_BY: setting to [7, 19971130, 1]
be_BY@latin: setting to [7, 19971130, 1]
ber_DZ: setting to [7, 19971130, 1]
ber_MA: setting to [7, 19971130, 1]
bg_BG: setting to [7, 19971130, 4]
bhb_IN: setting to [7, 19971130, 1]
bho_IN: setting to [7, 19971130, 1]
bn_BD: setting to [7, 19971130, 1]
bn_IN: setting to [7, 19971130, 1]
bo_CN: setting to [7, 19971130, 1]
br_FR: setting to [7, 19971130, 4]
brx_IN: setting to [7, 19971130, 1]
bs_BA: setting to [7, 19971130, 1]
byn_ER: setting to [7, 19971130, 1]
ca_AD: setting to [7, 19971130, 4]
ca_ES: setting to [7, 19971130, 4]
ca_ES@euro: setting to [7, 19971130, 4]
ca_FR: setting to [7, 19971130, 4]
ca_IT: setting to [7, 19971130, 4]
ce_RU: setting to [7, 19971130, 1]
cmn_TW: setting to [7, 19971130, 1]
crh_UA: setting to [7, 19971130, 1]
cv_RU: setting to [7, 19971130, 1]
cy_GB: setting to [7, 19971130, 4]
de_BE: setting to [7, 19971130, 4]
de_LU: setting to [7, 19971130, 4]
doi_IN: setting to [7, 19971130, 1]
dv_MV: setting to [7, 19971130, 1]
dz_BT: setting to [7, 19971130, 1]
el_GR: setting to [7, 19971130, 4]
el_GR@euro: setting to [7, 19971130, 4]
en_AU: setting to [7, 19971130, 1]
en_BW: setting to [7, 19971130, 1]
en_CA: setting to [7, 19971130, 1]
en_HK: setting to [7, 19971130, 1]
en_IE: setting to [7, 19971130, 4]
en_IN: setting to [7, 19971130, 1]
en_NG: setting to [7, 19971130, 1]
en_NZ: setting to [7, 19971130, 1]
en_PH: setting to [7, 19971130, 1]
en_SG: setting to [7, 19971130, 1]
en_ZA: setting to [7, 19971130, 1]
en_ZW: setting to [7, 19971130, 1]
es_AR: setting to [7, 19971130, 1]
es_BO: setting to [7, 19971130, 1]
es_CL: setting to [7, 19971130, 1]
es_CO: setting to [7, 19971130, 1]
es_CR: setting to [7, 19971130, 1]
es_DO: setting to [7, 19971130, 1]
es_EC: setting to [7, 19971130, 1]
es_ES@euro: setting to [7, 19971130, 4]
es_GT: setting to [7, 19971130, 1]
es_HN: setting to [7, 19971130, 1]
es_MX: setting to [7, 19971130, 1]
es_NI: setting to [7, 19971130, 1]
es_PA: setting to [7, 19971130, 1]
es_PE: setting to [7, 19971130, 1]
es_PR: setting to [7, 19971130, 1]
es_PY: setting to [7, 19971130, 1]
es_SV: setting to [7, 19971130, 1]
es_US: setting to [7, 19971130, 1]
es_UY: setting to [7, 19971130, 1]
es_VE: setting to [7, 19971130, 1]
eu_ES: setting to [7, 19971130, 4]
fa_IR: setting to [7, 19971130, 1]
fil_PH: setting to [7, 19971130, 1]
fo_FO: setting to [7, 19971130, 4]
fr_CA: setting to [7, 19971130, 1]
fr_CH: setting to [7, 19971130, 4]
fr_LU: setting to [7, 19971130, 4]
fy_NL: setting to [7, 19971130, 4]
ga_IE: setting to [7, 19971130, 4]
gd_GB: setting to [7, 19971130, 4]
gez_ER: setting to [7, 19971130, 1]
gez_ET: setting to [7, 19971130, 1]
gl_ES: setting to [7, 19971130, 4]
gu_IN: setting to [7, 19971130, 1]
gv_GB: setting to [7, 19971130, 4]
hak_TW: setting to [7, 19971130, 1]
ha_NG: setting to [7, 19971130, 1]
he_IL: setting to [7, 19971130, 1]
hi_IN: setting to [7, 19971130, 1]
hne_IN: setting to [7, 19971130, 1]
hr_HR: setting to [7, 19971130, 1]
hy_AM: setting to [7, 19971130, 1]
id_ID: setting to [7, 19971130, 1]
ig_NG: setting to [7, 19971130, 1]
ik_CA: setting to [7, 19971130, 1]
is_IS: setting to [7, 19971130, 4]
it_CH: setting to [7, 19971130, 4]
it_IT: setting to [7, 19971130, 4]
it_IT@euro: setting to [7, 19971130, 4]
iu_CA: setting to [7, 19971130, 1]
ja_JP: setting to [7, 19971130, 1]
ka_GE: setting to [7, 19971130, 1]
kk_KZ: setting to [7, 19971130, 1]
kl_GL: setting to [7, 19971130, 1]
km_KH: setting to [7, 19971130, 1]
kn_IN: setting to [7, 19971130, 1]
kok_IN: setting to [7, 19971130, 1]
ko_KR: setting to [7, 19971130, 1]
ks_IN: setting to [7, 19971130, 1]
ks_IN@devanagari: setting to [7, 19971130, 1]
ku_TR: setting to [7, 19971130, 1]
kw_GB: setting to [7, 19971130, 4]
ky_KG: setting to [7, 19971130, 1]
lg_UG: setting to [7, 19971130, 1]
lij_IT: setting to [7, 19971130, 4]
lo_LA: setting to [7, 19971130, 1]
lt_LT: setting to [7, 19971130, 4]
lv_LV: setting to [7, 19971130, 1]
lzh_TW: setting to [7, 19971130, 1]
mag_IN: setting to [7, 19971130, 1]
mai_IN: setting to [7, 19971130, 1]
mg_MG: setting to [7, 19971130, 1]
mhr_RU: setting to [7, 19971130, 1]
mi_NZ: setting to [7, 19971130, 1]
ml_IN: setting to [7, 19971130, 1]
mni_IN: setting to [7, 19971130, 1]
mn_MN: setting to [7, 19971130, 1]
mr_IN: setting to [7, 19971130, 1]
ms_MY: setting to [7, 19971130, 1]
mt_MT: setting to [7, 19971130, 1]
my_MM: setting to [7, 19971130, 1]
nan_TW: setting to [7, 19971130, 1]
nan_TW@latin: setting to [7, 19971130, 1]
ne_NP: setting to [7, 19971130, 1]
nhn_MX: setting to [7, 19971130, 1]
niu_NU: setting to [7, 19971130, 1]
niu_NZ: setting to [7, 19971130, 1]
nl_BE: setting to [7, 19971130, 4]
nl_BE@euro: setting to [7, 19971130, 4]
nr_ZA: setting to [7, 19971130, 1]
nso_ZA: setting to [7, 19971130, 1]
oc_FR: setting to [7, 19971130, 4]
om_ET: setting to [7, 19971130, 1]
om_KE: setting to [7, 19971130, 1]
or_IN: setting to [7, 19971130, 1]
os_RU: setting to [7, 19971130, 1]
pa_IN: setting to [7, 19971130, 1]
pap_AW: setting to [7, 19971130, 1]
pap_CW: setting to [7, 19971130, 1]
pa_PK: setting to [7, 19971130, 1]
ps_AF: setting to [7, 19971130, 1]
pt_BR: setting to [7, 19971130, 1]
pt_PT: setting to [7, 19971130, 4]
pt_PT@euro: setting to [7, 19971130, 4]
raj_IN: setting to [7, 19971130, 1]
ro_RO: setting to [7, 19971130, 1]
ru_RU: setting to [7, 19971130, 1]
ru_UA: setting to [7, 19971130, 1]
rw_RW: setting to [7, 19971130, 1]
sa_IN: setting to [7, 19971130, 1]
sat_IN: setting to [7, 19971130, 1]
sd_IN: setting to [7, 19971130, 1]
sd_IN@devanagari: setting to [7, 19971130, 1]
se_NO: setting to [7, 19971130, 4]
shs_CA: setting to [7, 19971130, 1]
sid_ET: setting to [7, 19971130, 1]
si_LK: setting to [7, 19971130, 1]
sl_SI: setting to [7, 19971130, 1]
so_DJ: setting to [7, 19971130, 1]
so_ET: setting to [7, 19971130, 1]
so_KE: setting to [7, 19971130, 1]
so_SO: setting to [7, 19971130, 1]
sq_AL: setting to [7, 19971130, 1]
ss_ZA: setting to [7, 19971130, 1]
st_ZA: setting to [7, 19971130, 1]
sv_FI: setting to [7, 19971130, 4]
sv_SE: setting to [7, 19971130, 4]
ta_IN: setting to [7, 19971130, 1]
tcy_IN: setting to [7, 19971130, 1]
te_IN: setting to [7, 19971130, 1]
tg_TJ: setting to [7, 19971130, 1]
the_NP: setting to [7, 19971130, 1]
th_TH: setting to [7, 19971130, 1]
ti_ER: setting to [7, 19971130, 1]
ti_ET: setting to [7, 19971130, 1]
tig_ER: setting to [7, 19971130, 1]
tk_TM: setting to [7, 19971130, 1]
tl_PH: setting to [7, 19971130, 1]
tn_ZA: setting to [7, 19971130, 1]
tr_CY: setting to [7, 19971130, 1]
tr_TR: setting to [7, 19971130, 1]
ts_ZA: setting to [7, 19971130, 1]
tt_RU: setting to [7, 19971130, 1]
tt_RU@iqtelif: setting to [7, 19971130, 1]
ug_CN: setting to [7, 19971130, 1]
ur_IN: setting to [7, 19971130, 1]
ur_PK: setting to [7, 19971130, 1]
uz_UZ: setting to [7, 19971130, 1]
uz_UZ@cyrillic: setting to [7, 19971130, 1]
ve_ZA: setting to [7, 19971130, 1]
vi_VN: setting to [7, 19971130, 1]
wa_BE: setting to [7, 19971130, 4]
wal_ET: setting to [7, 19971130, 1]
wo_SN: setting to [7, 19971130, 1]
xh_ZA: setting to [7, 19971130, 1]
yi_US: setting to [7, 19971130, 1]
yo_NG: setting to [7, 19971130, 1]
yue_HK: setting to [7, 19971130, 1]
zh_CN: setting to [7, 19971130, 1]
zh_HK: setting to [7, 19971130, 1]
zh_SG: setting to [7, 19971130, 1]
zh_TW: setting to [7, 19971130, 1]
zu_ZA: setting to [7, 19971130, 1]
Finally, set first_weekday in all the locales that were omitting it
and wanted something other than the default of 1.
aa_DJ: setting to 7
aa_ER: setting to 2
aa_ER@saaho: setting to 2
ar_AE: setting to 7
ar_BH: setting to 7
ar_DZ: setting to 7
ar_EG: setting to 7
ar_IQ: setting to 7
ar_JO: setting to 7
ar_KW: setting to 7
ar_LB: setting to 2
ar_LY: setting to 7
ar_MA: setting to 7
ar_OM: setting to 7
ar_QA: setting to 7
ar_SD: setting to 7
ar_SS: setting to 2
ar_SY: setting to 7
az_AZ: setting to 2
be_BY: setting to 2
be_BY@latin: setting to 2
ber_DZ: setting to 7
ber_MA: setting to 7
bn_BD: setting to 6
bs_BA: setting to 2
byn_ER: setting to 2
dv_MV: setting to 6
en_NG: setting to 2
es_BO: setting to 2
es_CL: setting to 2
es_EC: setting to 2
es_UY: setting to 2
fo_FO: setting to 2
fr_CH: setting to 2
gd_GB: setting to 2
gez_ER: setting to 2
ha_NG: setting to 2
hr_HR: setting to 2
hy_AM: setting to 2
ig_NG: setting to 2
is_IS: setting to 2
it_CH: setting to 2
ka_GE: setting to 2
kk_KZ: setting to 2
kl_GL: setting to 2
ku_TR: setting to 2
ky_KG: setting to 2
lg_UG: setting to 2
mg_MG: setting to 2
mn_MN: setting to 2
ms_MY: setting to 2
niu_NU: setting to 2
pap_AW: setting to 2
pap_CW: setting to 2
pt_PT: setting to 2
pt_PT@euro: setting to 2
rw_RW: setting to 2
se_NO: setting to 2
si_LK: setting to 2
so_DJ: setting to 7
so_SO: setting to 2
sq_AL: setting to 2
tg_TJ: setting to 2
ti_ER: setting to 2
tig_ER: setting to 2
tk_TM: setting to 2
tt_RU: setting to 2
tt_RU@iqtelif: setting to 2
uz_UZ: setting to 2
uz_UZ@cyrillic: setting to 2
vi_VN: setting to 2
wo_SN: setting to 2
yo_NG: setting to 2
The ISO 30112 standard defines the valid values for the category
keyword as only a few options:
posix:1993
i18n:2004
i18n:2012
The vast majority of locales had changed the "i18n" string to the
name of its own locale (e.g. "ak_GH:2013") as well as tweaking the
date (presumably thinking it should be the date of submission).
Convert all of them to "i18n:2012" for consistency. A follow up
change will update localedef to actually check/validate the field.
These entries have been checked mostly against Wikipedia, but also using
the sources it cites (like the UN and other treaty sources).
Fix incorrect values:
en_BW: changing RB to BW
kl_GL: changing GRO to KN
km_KH: changing LAO to KH
my_MM: changing BA to MYA
oc_FR: changing F to F
tr_CY: changing TR to CY
wae_CH: changing DH to CH
Add missing entries:
aa_DJ: changing to DJI
ak_GH: changing to GH
ar_OM: changing to OM
ar_SS: changing to SUD
ar_YE: changing to YAR
bo_CN: changing to CHN
cmn_TW: changing to RC
dv_MV: changing to MV
dz_BT: changing to BHT
en_AG: changing to AG
es_HN: changing to HN
es_PR: changing to PR
hak_TW: changing to RC
lzh_TW: changing to RC
nan_TW: changing to RC
nan_TW@latin: changing to RC
nl_AW: changing to AUA
pap_AW: changing to AUA
so_DJ: changing to DJI
the_NP: changing to NEP
ug_CN: changing to CHN
yue_HK: changing to HK
zh_CN: changing to CHN
zh_HK: changing to HK
zh_TW: changing to RC
There are only two measurement systems that locales use: US and metric.
For the former, move to copying the en_US locale, while for the latter,
move to copying the i18n locale. This lets us clean up all the stray
comments like FIXME.
There should be no functional differences here.