[Indic] Further adjust base algorithm for Sinhala
Apparently if there is C,V,ZWJ,C, the first C will be the base, but if it's C,ZWJ,V,C, the second one will be. Note that Uniscribe implements this differently: in the case of C,ZWJ,V,C it breaks the syllable, putting the first consonant in one syllable and the rest in the next syllable. Sinhala failures down from 208 to 158 (0.0581209%). No changes to Khmer.
parent 73d71cc527
commit 71fd5e80ad
@@ -560,12 +560,15 @@ initial_reordering_consonant_syllable (const hb_ot_map_t *map, hb_buffer_t *buff
   base = limit;
 
   /* Find the last base consonant that is not blocked by ZWJ. If there is
-   * a ZWJ before a bse consonant, that would request a subjoined form. */
+   * a ZWJ right before a base consonant, that would request a subjoined form. */
   for (unsigned int i = limit; i < end; i++)
     if (is_consonant (info[i]) && info[i].indic_position() == POS_BASE_C)
-      base = i;
-    else if (info[i].indic_category() == OT_ZWJ)
-      break;
+    {
+      if (limit < i && info[i - 1].indic_category() == OT_ZWJ)
+        break;
+      else
+        base = i;
+    }
 
   /* Mark all subsequent consonants as below. */
   for (unsigned int i = base + 1; i < end; i++)
@@ -32,3 +32,6 @@
ග්යෙ
ර්ය්ය
එඬේ
න්ගේ
න්ගේ
න්ගේ
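
The rule from the commit message can be illustrated with a small standalone sketch. It is not the HarfBuzz code: the `Cat` enum and both `find_base_*` functions are hypothetical names invented here, the `POS_BASE_C` position check is left out, and "V" is assumed to denote the virama. The old loop stops at any ZWJ; the new loop stops only at a consonant that is immediately preceded by ZWJ.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified categories; HarfBuzz's real types are richer.
enum class Cat { Consonant, Virama, ZWJ, Other };

// Behavior before this commit: any ZWJ ends the search for a base.
std::size_t find_base_old(const std::vector<Cat>& s, std::size_t limit = 0)
{
    std::size_t base = limit;
    for (std::size_t i = limit; i < s.size(); i++)
        if (s[i] == Cat::Consonant)
            base = i;
        else if (s[i] == Cat::ZWJ)
            break;
    return base;
}

// Behavior after this commit: only a consonant that directly follows
// a ZWJ ends the search; a ZWJ elsewhere is skipped over.
std::size_t find_base_new(const std::vector<Cat>& s, std::size_t limit = 0)
{
    std::size_t base = limit;
    for (std::size_t i = limit; i < s.size(); i++)
        if (s[i] == Cat::Consonant)
        {
            if (limit < i && s[i - 1] == Cat::ZWJ)
                break;
            base = i;
        }
    return base;
}
```

Under these assumptions, both versions pick index 0 for C,V,ZWJ,C, while for C,ZWJ,V,C the old loop picks index 0 and the new loop picks index 3, matching the behavior the message describes.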