[Indic] Relax grammar
Now that we insert dotted circles, tests break more easily when our Indic machine breaks. In particular, a few Devanagari tests had sequences like "C,H,ZWJ,N", and because of the ZWJ, the Nukta does NOT get reordered to before the Halant as the grammar used to expect. Fix that by allowing a Nukta after the ZWJ in halant_group. Another case is as simple as "C,ZWJ,SM", which is handled by allowing a ZWJ inside cn.

Fixes 10 out of 79 failures:

DEVANAGARI: 707325 out of 707394 tests passed. 69 failed (0.00975411%)
parent aa7141efe4
commit 4ed717ef61
@@ -62,12 +62,12 @@ z = ZWJ|ZWNJ; # is_joiner
 h = H | Coeng; # is_halant_or_coeng
 reph = (Ra H | Repha); # possible reph
 
-cn = c.n?;
+cn = c.ZWJ?.n?;
 forced_rakar = ZWJ H ZWJ Ra;
 matra_group = z{0,3}.M.N?.(H | forced_rakar)?;
 syllable_tail = (Coeng (cn|V))? (SM.ZWNJ?)? (VD VD?)?;
 place_holder = NBSP | DOTTEDCIRCLE;
-halant_group = (z?.h.ZWJ?);
+halant_group = (z?.h.(ZWJ.N?)?);
 final_halant_group = halant_group | h.ZWNJ;
 halant_or_matra_group = (final_halant_group | matra_group{0,4});
 
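The effect of the two relaxed rules can be approximated with ordinary regular expressions. The sketch below is not the Ragel machine itself: it maps each category to one letter, treats `c` and `V` as single categories (their definitions sit outside this hunk), and uses a hypothetical top-level rule `cn halant_or_matra_group syllable_tail` in place of the real syllable rules, which are also not shown here. The names `CODES`, `encode`, `build`, `OLD` and `NEW` are made up for this illustration.

```python
import re

# One letter per shaping category, so a plain regex can stand in for Ragel.
CODES = {"C": "c", "N": "n", "H": "h", "Coeng": "k", "ZWJ": "j", "ZWNJ": "z",
         "M": "m", "SM": "s", "VD": "d", "V": "v", "Ra": "r"}


def encode(seq):
    """Turn a sequence such as "C,H,ZWJ,N" into the letter string "chjn"."""
    return "".join(CODES[name] for name in seq.split(","))


def build(cn, halant_group):
    """Compile a simplified syllable regex from the two rules under test."""
    z, h = "[jz]", "[hk]"          # z = ZWJ|ZWNJ, h = H|Coeng
    forced_rakar = "jhjr"          # ZWJ H ZWJ Ra
    matra_group = f"(?:{z}{{0,3}}mn?(?:h|{forced_rakar})?)"
    syllable_tail = f"(?:k(?:{cn}|v))?(?:sz?)?(?:dd?)?"
    final_halant_group = f"(?:{halant_group}|{h}z)"
    halant_or_matra_group = f"(?:{final_halant_group}|{matra_group}{{0,4}})"
    # Assumption: a stand-in top-level rule; the real one is outside the hunk.
    return re.compile(f"{cn}{halant_or_matra_group}{syllable_tail}")


# Rules before the commit: cn = c.n?  and  halant_group = z?.h.ZWJ?
OLD = build("(?:cn?)", "(?:[jz]?[hk]j?)")
# Rules after the commit: cn = c.ZWJ?.n?  and  halant_group = z?.h.(ZWJ.N?)?
NEW = build("(?:cj?n?)", "(?:[jz]?[hk](?:jn?)?)")

if __name__ == "__main__":
    for seq in ("C,H,ZWJ,N", "C,ZWJ,SM", "C,N,H"):
        s = encode(seq)
        # Report whether the whole sequence forms one syllable under each grammar.
        print(seq, bool(OLD.fullmatch(s)), bool(NEW.fullmatch(s)))
```

Under these assumptions, "C,H,ZWJ,N" and "C,ZWJ,SM" are rejected as a single syllable by the old rules and accepted by the new ones, while "C,N,H" is accepted by both. `re.fullmatch` only asks whether the whole sequence forms one syllable; it does not reproduce the Ragel scanner's longest-match behavior across a run of text.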