[Indic] Relax grammar

Now that we insert dotted-circle, tests break more easily when our indic
machine breaks.

In particular, a few Devanagari tests were having sequences like
"C,H,ZWJ,N", and because of the ZWJ the Nukta does NOT get reordered to
before the Halant as the grammar used to expect...  Fixup.

Another case is as simple as "C,ZWJ,SM".

Fixes 10 out of 79 failures:

DEVANAGARI: 707325 out of 707394 tests passed. 69 failed (0.00975411%)
This commit is contained in:
Behdad Esfahbod 2012-09-05 17:21:17 -04:00
parent aa7141efe4
commit 4ed717ef61

View File

@ -62,12 +62,12 @@ z = ZWJ|ZWNJ; # is_joiner
h = H | Coeng; # is_halant_or_coeng
reph = (Ra H | Repha); # possible reph
cn = c.n?;
cn = c.ZWJ?.n?;
forced_rakar = ZWJ H ZWJ Ra;
matra_group = z{0,3}.M.N?.(H | forced_rakar)?;
syllable_tail = (Coeng (cn|V))? (SM.ZWNJ?)? (VD VD?)?;
place_holder = NBSP | DOTTEDCIRCLE;
halant_group = (z?.h.ZWJ?);
halant_group = (z?.h.(ZWJ.N?)?);
final_halant_group = halant_group | h.ZWNJ;
halant_or_matra_group = (final_halant_group | matra_group{0,4});