[Indic] Implement Reph+Ya-Phalaa interaction

The sequence Ra,H,Ya in Bengali is ambiguous: it can form either Reph
or Ya-Phalaa.  Unicode resolves this with ZWJ: to get Ya-Phalaa, one
places a ZWJ before the Halant.  I.e., a ZWJ,H sequence requests
subjoining, while an H,ZWJ sequence requests the half form.  Implement
that.
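
As a rough illustration of the rule above (not the HarfBuzz code itself), the joiner's meaning can be read off from which side of it the Halant sits on. The `Category` and `JoinerRequest` enums and the `classify_zwj` helper are hypothetical names invented for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, heavily simplified categories; the real shaper has many more.
enum Category { CONSONANT, HALANT, ZWJ };

// What a ZWJ asks for, depending on where the Halant is.
enum JoinerRequest { NO_REQUEST, HALF_FORM, SUBJOINED_FORM };

// Classify the ZWJ at index i of a syllable (illustrative helper only).
JoinerRequest classify_zwj(const std::vector<Category> &syllable, std::size_t i)
{
  assert(i < syllable.size() && syllable[i] == ZWJ);

  // H,ZWJ: a Halant immediately before the joiner requests the half form.
  if (i > 0 && syllable[i - 1] == HALANT)
    return HALF_FORM;

  // ZWJ,H: a Halant immediately after the joiner requests subjoining.
  if (i + 1 < syllable.size() && syllable[i + 1] == HALANT)
    return SUBJOINED_FORM;

  return NO_REQUEST;
}
```

Under these assumptions, Ra,ZWJ,H,Ya classifies the joiner as a subjoining request (Ya-Phalaa), while Ra,H,ZWJ,Ya classifies it as a half-form request.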

Bengali failures go down from 377 to 297 (0.0838308%).
Gujarati is down by 4 to 17 (0.0046384%).
Kannada is down by 226 to 957 (0.100534%).

Current status:

BENGALI: 353988 out of 354285 tests passed. 297 failed (0.0838308%)
DEVANAGARI: 693571 out of 693628 tests passed. 57 failed (0.00821766%)
GUJARATI: 366489 out of 366506 tests passed. 17 failed (0.0046384%)
GURMUKHI: 60750 out of 60809 tests passed. 59 failed (0.0970251%)
KANNADA: 950956 out of 951913 tests passed. 957 failed (0.100534%)
KHMER: 299094 out of 299124 tests passed. 30 failed (0.0100293%)
MALAYALAM: 1046857 out of 1048416 tests passed. 1559 failed (0.148701%)
ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
SINHALA: 271699 out of 271847 tests passed. 148 failed (0.0544424%)
TAMIL: 1091837 out of 1091837 tests passed. 0 failed (0%)
TELUGU: 970524 out of 970573 tests passed. 49 failed (0.00504856%)
Behdad Esfahbod 2012-07-24 03:04:36 -04:00
parent dff0ece11d
commit 88f413b56f
2 changed files with 12 additions and 2 deletions


@@ -553,8 +553,14 @@ initial_reordering_consonant_syllable (const hb_ot_map_t *map, hb_buffer_t *buff
         }
         else
         {
-          /* A ZWJ stops the base search, and requests an explicit half form. */
-          if (info[i].indic_category() == OT_ZWJ)
+          /* A ZWJ after a Halant stops the base search, and requests an explicit
+           * half form.
+           * A ZWJ before a Halant, requests a subjoined form instead, and hence
+           * search continues.  This is particularly important for Bengali
+           * sequence Ra,H,Ya that should form Ya-Phalaa by subjoining Ya. */
+          if (start < i &&
+              info[i].indic_category() == OT_ZWJ &&
+              info[i - 1].indic_category() == OT_H)
             break;
         }
       } while (i > limit);
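
The effect of the new condition on the backward base search can be sketched in a standalone form. This is a simplified model, not the actual function: the `Item` struct, its `has_below_or_post_form` flag, and `find_base` are hypothetical, and the real shaper tracks considerably more state.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-ins for the shaper's per-glyph info.
enum Category { CONSONANT, HALANT, ZWJ };

struct Item {
  Category category;
  bool has_below_or_post_form; // only meaningful for consonants
};

// Scan backwards from the end of the syllable towards `limit`, looking for
// the base consonant. Returns syllable.size() if no consonant was reached.
std::size_t find_base(const std::vector<Item> &syllable, std::size_t limit)
{
  assert(limit <= syllable.size());
  std::size_t base = syllable.size();
  std::size_t i = syllable.size();

  while (i > limit) {
    i--;
    const Item &item = syllable[i];
    if (item.category == CONSONANT) {
      // Every consonant reached is a candidate; one without a below-base
      // or post-base form ends the search immediately.
      base = i;
      if (!item.has_below_or_post_form)
        break;
    } else if (i > 0 &&
               item.category == ZWJ &&
               syllable[i - 1].category == HALANT) {
      // H,ZWJ: explicit half form requested, stop searching.
      // A ZWJ not preceded by a Halant (ZWJ,H) lets the search continue.
      break;
    }
  }
  return base;
}
```

In this model, Ra,ZWJ,H,Ya with `limit` 0 yields Ra as the base (so Ya can be subjoined), whereas Ra,H,ZWJ,Ya stops at the joiner and leaves Ya as the base.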


@@ -8,3 +8,7 @@
 র্কৈ
 র্কো
 র্কৌ
+র্য
+র্‍য
+র‍্য
+র্র‍্য
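
Because ZWJ is invisible, the added test lines are hard to tell apart by eye. The sketch below spells out the code points for the joiner placements the commit message describes, assuming the four added lines are Ra,H,Ya; Ra,H,ZWJ,Ya; Ra,ZWJ,H,Ya; and Ra,H,Ra,ZWJ,H,Ya in that order. The constant names and the `code_points` helper are made up for illustration.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Bengali RA = U+09B0, VIRAMA (Halant) = U+09CD, YA = U+09AF, ZWJ = U+200D.
const std::u32string kRaHalantYa       = U"\u09B0\u09CD\u09AF";
const std::u32string kRaHalantZwjYa    = U"\u09B0\u09CD\u200D\u09AF";
const std::u32string kRaZwjHalantYa    = U"\u09B0\u200D\u09CD\u09AF";
const std::u32string kRephRaZwjHalantYa =
    U"\u09B0\u09CD\u09B0\u200D\u09CD\u09AF";

// Render a UTF-32 string as space-separated "U+XXXX" code points.
std::string code_points(const std::u32string &text)
{
  std::string out;
  char buf[16];
  for (char32_t c : text) {
    int n = std::snprintf(buf, sizeof buf, "U+%04X",
                          static_cast<unsigned>(c));
    assert(n > 0 && n < static_cast<int>(sizeof buf));
    if (!out.empty())
      out += ' ';
    out += buf;
  }
  return out;
}
```

Dumping a test line this way makes it easy to check which side of the Halant the joiner is on before comparing shaping output.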