443 Commits

Author SHA1 Message Date
Behdad Esfahbod
7f852b644b Fix compiler warnings 2012-05-11 23:10:31 +02:00
Behdad Esfahbod
6a091df9b4 [Indic] Disambiguate sub vs post vs above matras
Bengali is at *just* above 5% now.
2012-05-11 21:42:27 +02:00
Behdad Esfahbod
9d0d319a4a [Indic] Position Bengali Reph before matras 2012-05-11 21:36:32 +02:00
Behdad Esfahbod
f893672511 [Indic] Start categorizing Reph per script 2012-05-11 21:10:03 +02:00
Behdad Esfahbod
a913b024d8 [Indic] Apply 'init' feature for Bengali
Error down from 20% to 7%.
2012-05-11 20:59:26 +02:00
Behdad Esfahbod
eed903b164 [Indic] Refactor for the arrival of 'init' feature
Yep, on Bengali now!
2012-05-11 20:50:53 +02:00
Behdad Esfahbod
18c06e189b [Indic] Add Uniscribe bug feature for dotted circle
For dotted-circle independent clusters, Uniscribe does no Reph shaping
for the exact sequence Ra+Halant+25CC.  Which also is the only possible
sequence with 25CC at the end.
2012-05-11 20:02:14 +02:00
Behdad Esfahbod
0831061efb [Indic] Refactoring 2012-05-11 19:07:58 +02:00
Behdad Esfahbod
7ea58db311 Minor 2012-05-11 18:58:57 +02:00
Behdad Esfahbod
3399a06e70 [Indic] Fix U+0952 and similar classification to match Uniscribe
See comments.
2012-05-11 17:54:26 +02:00
Behdad Esfahbod
11aa3ef18d [Indic] Treat U+0951..U+0954 all similar to U+0952 2012-05-11 17:30:48 +02:00
Behdad Esfahbod
892eb78782 [Indic] Implement Uniscribe Reph+Matra+Halant bug feature 2012-05-11 16:54:40 +02:00
Behdad Esfahbod
67ea29af49 [Indic] Add example of different Uniscribe behavior 2012-05-11 16:51:23 +02:00
Behdad Esfahbod
ebe29733d4 [Indic] Add runtime Uniscribe bug compatibility mode!
Enable by setting envvar:

  HB_OT_INDIC_OPTIONS=uniscribe-bug-compatible

Plus, LeftMatra+Halant "feature".
2012-05-11 16:43:12 +02:00
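As an illustration of the runtime switch described in the commit above, here is a minimal sketch of reading such an environment variable. The variable name and the option string come from the commit message; the function name `parse_indic_options` and the comma-separated format are assumptions made for this example and are not taken from the HarfBuzz source.

```python
import os


def parse_indic_options(env=None):
    """Return the set of option names found in HB_OT_INDIC_OPTIONS.

    Hypothetical helper: it assumes options are separated by commas,
    which the commit message does not state.
    """
    env = os.environ if env is None else env
    raw = env.get("HB_OT_INDIC_OPTIONS", "")
    return {token.strip() for token in raw.split(",") if token.strip()}


# Simulate the environment instead of touching the real one.
options = parse_indic_options({"HB_OT_INDIC_OPTIONS": "uniscribe-bug-compatible"})
uniscribe_bug_compatible = "uniscribe-bug-compatible" in options
print(uniscribe_bug_compatible)
```

Reading the flag once and keeping it as a boolean keeps the per-glyph shaping code free of string comparisons.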
Behdad Esfahbod
616e692e29 [Indic] Add #define UNISCRIBE_BUG_COMPATIBLE 1 2012-05-11 16:25:02 +02:00
Behdad Esfahbod
6782bdae3b [Indic] Fix Left Matra + Halant reordering
As can be seen in: U+092B,U+093F,U+094D
2012-05-11 16:23:43 +02:00
Behdad Esfahbod
3c2ea9481b Minor 2012-05-11 16:23:38 +02:00
Behdad Esfahbod
668c6046c1 [Indic] Apply Reph mask to all POS_REPH glyphs
Needed for upcoming changes to GSUB/GPOS mask matching.
2012-05-11 15:34:13 +02:00
Behdad Esfahbod
cee7187447 [Indic] Move syllable tracking from Indic to generic layer
This is to incorporate it into GSUB/GPOS processing.
2012-05-11 11:41:39 +02:00
Behdad Esfahbod
3bf27a9f0e [Indic] Disable conjuncts when a ZWJ happens
Not that the code makes any difference since the presence of ZWJ itself
causes the ligature to fail to match anyway.
2012-05-11 11:17:23 +02:00
Behdad Esfahbod
c6d904d67d [Indic] Fix bitops typo!
Another 1000 down!
2012-05-11 11:07:40 +02:00
Behdad Esfahbod
02b2922fbf [Indic] Towards better Reph positioning
Fixed for Deva cases with two full-form consonants.  Failures **way** down.
Not much left to go :-).
2012-05-10 21:44:50 +02:00
Behdad Esfahbod
2b70df5cc0 [Indic] Add note re Uniscribe clusters 2012-05-10 18:38:22 +02:00
Behdad Esfahbod
21d2803133 [Indic] Do clustering like Uniscribe does
Hindi Wikipedia failures down to 6639 (0.938381%)!
2012-05-10 18:34:34 +02:00
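The two figures in the commit above can be combined to estimate the size of the test corpus. This is only a back-of-the-envelope sketch: it assumes the percentage is simply failures divided by the total number of test cases, which the commit message does not spell out.

```python
failures = 6639                 # from the commit message
failure_rate_percent = 0.938381  # from the commit message

# total = failures / rate, with the rate converted from percent to a fraction
estimated_total = round(failures / (failure_rate_percent / 100))
print(estimated_total)  # on the order of 700 thousand test cases
```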
Behdad Esfahbod
8df5636968 [Indic] Reorder Reph to before the Halant after Matras
Uniscribe doesn't do it, but we want to do as it gives the Reph the
opportunity to interact with the Matras.  Test with mangal for example.
Sequence: <0930,094d,0915,094b,094d>
In test suite already.
2012-05-10 15:41:04 +02:00
Behdad Esfahbod
daf3234bdc [Indic] Don't clear the mask for Reph
This was removing the mandatory global 1 bit in the mask and hence
disabling GPOS for Reph!
2012-05-10 15:28:27 +02:00
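The bug described in the commit above can be pictured with a small bitmask sketch. The bit values and names below are hypothetical and chosen only for illustration; the point is that overwriting a glyph's mask drops the global bit, while OR-ing the feature bit in preserves it.

```python
GLOBAL_BIT = 0x1  # hypothetical: bit set on every glyph so global features apply
REPH_BIT = 0x2    # hypothetical: bit allocated to the Reph feature


def lookup_applies(glyph_mask, lookup_mask):
    """A lookup applies to a glyph when their masks share at least one bit."""
    return (glyph_mask & lookup_mask) != 0


glyph_mask = GLOBAL_BIT

buggy_mask = REPH_BIT               # overwriting: the global bit is lost
fixed_mask = glyph_mask | REPH_BIT  # adding: the global bit survives

gpos_lookup_mask = GLOBAL_BIT       # assumption: GPOS lookups match on the global bit
print(lookup_applies(buggy_mask, gpos_lookup_mask))  # False
print(lookup_applies(fixed_mask, gpos_lookup_mask))  # True
```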
Behdad Esfahbod
7708ee23cb [Indic] Improve Left Matra repositioning
Move its dependents too.
2012-05-10 14:48:25 +02:00
Behdad Esfahbod
dbb105883c [Indic] Do Reph repositioning in final reordering like the spec says
This introduced a failure, which we tracked down to a test case like this:

  U+092E,U+094B,U+094D,U+0930

The final character is a Ra that should be put in a syllable of its
own.  And we do.  But it will interact with the Halant before it.  So
now we finally are convinced that we have to limit features to syllable
boundaries.  That's coming after lunch!
2012-05-10 13:45:52 +02:00
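To make the failing test case in the commit above easier to read, the following sketch turns the code points into a string and prints each character's Unicode name using Python's standard `unicodedata` module. It only inspects the characters; it does not perform any shaping.

```python
import unicodedata

# Code points taken from the commit message.
code_points = [0x092E, 0x094B, 0x094D, 0x0930]

text = "".join(chr(cp) for cp in code_points)
names = [unicodedata.name(ch) for ch in text]

for cp, name in zip(code_points, names):
    print(f"U+{cp:04X}  {name}")
```

The last two entries are the Halant (named VIRAMA in Unicode) and the trailing Ra, which is the pair the commit says must not interact across the syllable boundary.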
Behdad Esfahbod
4705a70269 Minor 2012-05-10 13:09:08 +02:00
Behdad Esfahbod
4ac9e98d9d [Indic] Reorder left matras to be closer to base 2012-05-10 12:53:53 +02:00
Behdad Esfahbod
1a1fa8c655 [Indic] Treat the standalone cluster case reusing the consonant logic 2012-05-10 12:21:30 +02:00
Behdad Esfahbod
190eb31a16 [Indic] Minor 2012-05-10 12:21:30 +02:00
Behdad Esfahbod
c5306b6861 [Indic] Handle Vowel syllables
Reusing the consonant logic!
2012-05-10 12:21:30 +02:00
Behdad Esfahbod
6d8e0cb74c [Indic] Simplify Reph logic 2012-05-10 11:41:51 +02:00
Behdad Esfahbod
3d25079f8d [Indic] Don't form Reph if Ra is the only consonant in the syllable 2012-05-10 11:37:42 +02:00
Behdad Esfahbod
b99d63ae11 [Indic] Increase max syllable length
20 was way too low, one could hit a syllable with 7ish consonants with it.
2012-05-10 11:32:52 +02:00
Behdad Esfahbod
a391ff50b9 [Indic] Adjust base after sorting 2012-05-10 11:31:20 +02:00
Behdad Esfahbod
d3637edb24 [Indic] Don't return for long syllables. Just not sort. 2012-05-10 10:51:38 +02:00
Behdad Esfahbod
ef24cc8c8e [Indic] Towards multi-cluster syllables and final reordering 2012-05-09 18:10:20 +02:00
Behdad Esfahbod
92332e5116 Minor 2012-05-09 17:40:00 +02:00
Behdad Esfahbod
dbccf87eef [Indic] Make room for more reordering positions 2012-05-09 17:24:39 +02:00
Behdad Esfahbod
d4480ace7f [Indic] Improve matra vs consonant ordering
Another 1.5% down.
2012-05-09 15:59:47 +02:00
Behdad Esfahbod
33c92e7695 [Indic] Categorize Anudatta 2012-05-09 15:41:51 +02:00
Behdad Esfahbod
19d984edaa [Indic] Make sure Reph jumps over all matras to the right
Another 12 thousand failures gone! (78 to go)
2012-05-09 15:21:13 +02:00
Behdad Esfahbod
9034641333 [Indic] Keep Vedic signs at the right too 2012-05-09 15:04:58 +02:00
Behdad Esfahbod
d1deaa2f5b Replace zerowidth invisible chars with a zero-advance space glyph
Like Uniscribe does.
2012-05-09 15:04:13 +02:00
Behdad Esfahbod
49e5da1591 [indic] Keep the syllable modifier marks to the right
Shaping failures on Hindi Wikipedia go down from 25% to 14%!
2012-05-09 13:23:27 +02:00
Behdad Esfahbod
76b3409de6 [indic] Better Reph matching 2012-05-09 11:52:32 +02:00
Behdad Esfahbod
df6d45c693 Minor 2012-05-09 11:38:31 +02:00
Behdad Esfahbod
412b91889d [indic] Apply Indic features in order 2012-05-09 11:07:18 +02:00
Behdad Esfahbod
1ac075b227 [indic] Apply rakaar forms
Fixes 10% of the failures against all of Hindi Wikipedia!
2012-05-09 11:06:47 +02:00
Behdad Esfahbod
3ed4634ec3 Add Indic inspection tool 2012-04-19 22:35:01 -04:00
Behdad Esfahbod
a06411ecf9 Minor matra renumbering
Should have no visible effect.
2012-04-19 22:28:25 -04:00
Behdad Esfahbod
c65662b71e Fix left-matra positioning in Indic
Fixes 200 failures out of previous 4290 cases in the OO.o Indic
dictionary (of ~16000 entries).
2012-04-12 09:31:55 -04:00
Behdad Esfahbod
acd88e659f In Arabic fallback shaping, check that the font has glyph for new char 2012-04-10 18:02:20 -04:00
Behdad Esfahbod
11138ccff7 Add normalize mode
In preparation for Hangul shaper.
2012-04-05 17:25:19 -04:00
Behdad Esfahbod
e8eedf2687 Avoid enum trailing commas
Based on patch from Jonathan Kew.
2012-01-16 16:39:40 -05:00
Behdad Esfahbod
0a965eee88 Minor 2011-09-19 16:53:47 -04:00
Behdad Esfahbod
c605bbbb6d Remove C++ guards from source files
Were causing issues for people with MSVC.
2011-08-04 20:00:53 -04:00
Behdad Esfahbod
a91c58bf98 [Indic] Disable CJCT-disabling logic
Read comment.
2011-08-01 16:30:11 -04:00
Behdad Esfahbod
5e72071062 [Indic] Stop looking for base upon seeing joiners
Not sure where this is documented, but I remember this being the desired
behavior.

test-shape-complex failures are down from 48 to 46.  Meh.
2011-07-31 17:52:44 -04:00
Behdad Esfahbod
281683995a Cosmetic 2011-07-31 16:00:35 -04:00
Behdad Esfahbod
6b37bc8084 [Indic] Fix ZWJ/ZWNJ application
Not quite working just yet.  False alarm re 10 failures.  It was
crashing.  Ouch!  Back to 48 failures.
2011-07-31 15:57:00 -04:00
Behdad Esfahbod
e7be057024 [Indic] Add Final Reordering rules into comments
Not applied yet.
2011-07-31 15:22:46 -04:00
Behdad Esfahbod
cfd4382ec1 [Indic] Handle Reph when determining base consonant 2011-07-31 15:08:40 -04:00
Behdad Esfahbod
97158392a5 [Indic] Ra is a consonant too 2011-07-31 15:01:28 -04:00
Behdad Esfahbod
0d8f8a177c [Indic] Fix reph inhibition logic 2011-07-31 14:57:59 -04:00
Behdad Esfahbod
9da0487cd4 [Indic] Support ZWJ/ZWNJ
Brings test-shape-complex failures down from 52 to 10!

I hereby declare harfbuzz-ng supporting Indic!
2011-07-31 13:46:44 -04:00
Behdad Esfahbod
9ee27a928a [Indic] Suppress reph formation upon joiners 2011-07-31 11:10:14 -04:00
Behdad Esfahbod
8354e004e5 Un-Ra U+09F1. According to the test suite this is correct.
But I'm not sure...  Down from 54 failures to 52.
2011-07-31 02:24:51 -04:00
Behdad Esfahbod
ba7e85c104 Cosmetic 2011-07-30 21:11:53 -04:00
Behdad Esfahbod
f5bc2725cb [Indic] For old-style Indic tables, move Halant around
In old-style Indic OT standards, the post-base Halants are moved after
their base.  Emulate that by moving first post-base Halant to
post-last-consonant.

Brings test-shape-complex failures down from 88 to 54.  Getting there!
2011-07-30 21:08:10 -04:00
Behdad Esfahbod
fd06bf5611 [Indic] Handle initial Ra+Halant in scripts that support Reph
Brings test-shape-complex failures down from 104 to 92.  Way to go!
2011-07-30 20:14:44 -04:00
Behdad Esfahbod
ee58f3bc75 Minor 2011-07-30 19:15:53 -04:00
Behdad Esfahbod
352372ae5e [Indic] Categorize Ra in scripts that have Reph
Is the categorization correct?  I don't know.
2011-07-30 19:04:02 -04:00
Behdad Esfahbod
45d6f29f15 [Indic] Reorder matras
Number of failing shape-complex tests goes from 125 down to 94.

Next: Add Ra handling and it's fair to say we kinda support Indic :).
2011-07-30 14:44:30 -04:00
Behdad Esfahbod
743807a3ce [Indic] Apply Indic features
Find the base consonant and apply basic Indic features accordingly.
Nothing complete, but does something for now.  Specifically:
no Ra handling right now, and no ZWJ/ZWNJ.

Number of failing shape-complex tests goes from 174 down to 125.

Next: reorder matras.
2011-07-29 16:46:09 -04:00
Behdad Esfahbod
9f9bcceca6 Register buffer vars in Indic shaper 2011-07-28 17:07:50 -04:00
Behdad Esfahbod
b65c06025d Formalize buffer var allocations 2011-07-28 16:49:29 -04:00
Behdad Esfahbod
02cdf743c2 Add prefer_decomposed() complex-shaper callback
This allows the Indic shaper to request decomposed characters.  This will
handle split matra for free.  Other shapers prefer precomposed
characters.
2011-07-21 12:23:12 -04:00
Behdad Esfahbod
a54a5505a3 Minor 2011-07-20 16:42:10 -04:00
Behdad Esfahbod
f6fd3780e1 Let shapers decide when to apply ccmp and locl
Instead of always applying those two features before the complex shaper,
let the complex shaper decide whether they should be applied first.

Also add stub for Indic's final_reordering().
2011-07-08 00:22:40 -04:00
Behdad Esfahbod
76f76812ac Shuffle code around, remove shape_plan from complex shapers 2011-07-07 22:25:25 -04:00
Behdad Esfahbod
d69d5ceaa0 [Indic] Well, at least finding syllables works now :)
Still not much there.
2011-07-04 12:56:38 -04:00
Behdad Esfahbod
4ec30aec30 [Indic] Optimize Indic table storage 2011-06-28 14:13:38 -04:00
Behdad Esfahbod
8fdba506f0 [Indic] Define indic_position_t 2011-06-24 20:45:55 -04:00
Behdad Esfahbod
65988a145b [Indic] Add a table of consonant positions
Copied from HarfBuzz.old Indic data.  These are below and post
consonants.  This is temporary.  Read the comment in the patch.
2011-06-24 19:05:52 -04:00
Behdad Esfahbod
c7fe56a1d5 [Indic] Some of the basic features are global; Mark them so 2011-06-24 19:05:34 -04:00
Behdad Esfahbod
867361c3ad [indic] Add syllable recognition state machine
Using an incredible tool called Ragel.
2011-06-17 18:35:46 -04:00
Behdad Esfahbod
422e08dbb8 Better categorize Indic character classes
Matches OT types now.
2011-06-15 17:22:48 -04:00
Behdad Esfahbod
b9452bfc16 Fix compiler warnings with -pedantic 2011-06-14 14:47:07 -04:00
Behdad Esfahbod
20503ccd57 More Indic data shuffling 2011-06-07 17:02:48 -04:00
Behdad Esfahbod
b9ddbd5593 [Indic] Start an Indic shaper
Nothing functional in there yet.

So far, we're parsing the IndicSyllabicCategory.txt and IndicMatraCategory.txt
files from the Unicode Character Database and storing them in an array to be
used by the shaper.  Also hooked up the shaper, but it does not do anything
right now.
2011-06-02 17:43:12 -04:00