Commit Graph

388 Commits

Author SHA1 Message Date
Behdad Esfahbod
f3584d3a3a Add test cases for Thai PUA shaping 2012-11-14 17:53:06 -08:00
Behdad Esfahbod
6b19fa4862 Adjust diff rule for the new hb-shape output format 2012-11-14 11:38:50 -08:00
Behdad Esfahbod
82c4d9880a Add Sinhala test case for split matra U+0DDA 2012-11-14 10:56:02 -08:00
Behdad Esfahbod
d04b128531 Fix test 2012-11-14 10:53:10 -08:00
Behdad Esfahbod
0c7df22228 Add buffer flags
New API:

	hb_buffer_flags_t

	HB_BUFFER_FLAGS_DEFAULT
	HB_BUFFER_FLAG_BOT
	HB_BUFFER_FLAG_EOT
	HB_BUFFER_FLAG_PRESERVE_DEFAULT_IGNORABLES

	hb_buffer_set_flags()
	hb_buffer_get_flags()

We use the BOT flag to decide whether to insert dottedcircle if the
first char in the buffer is a combining mark.

The PRESERVE_DEFAULT_IGNORABLES flag prevents removal of characters like
ZWNJ/ZWJ/...
2012-11-13 14:42:35 -08:00
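The BOT rule described in the commit above can be illustrated with a small Python sketch. This is not HarfBuzz code: the `BufferFlags` class, its numeric values, and the helper name `maybe_insert_dotted_circle` are all hypothetical, and the "combining mark" test is approximated with the Unicode general category.

```python
import unicodedata
from enum import IntFlag

DOTTED_CIRCLE = "\u25CC"


class BufferFlags(IntFlag):
    # Hypothetical Python mirror of hb_buffer_flags_t; values are illustrative only.
    DEFAULT = 0
    BOT = 1
    EOT = 2
    PRESERVE_DEFAULT_IGNORABLES = 4


def maybe_insert_dotted_circle(text, flags):
    """Prepend U+25CC when the buffer is at beginning-of-text and starts with a mark."""
    if not text or not (flags & BufferFlags.BOT):
        return text
    # Assumption: any general category starting with "M" counts as a combining mark.
    if unicodedata.category(text[0]).startswith("M"):
        return DOTTED_CIRCLE + text
    return text
```

Without the BOT flag the text is returned unchanged, which matches the idea that a mark at the start of a mid-paragraph run may attach to a base in the preceding run.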
Behdad Esfahbod
c8d4f8b0fe Minor 2012-11-13 14:10:19 -08:00
Behdad Esfahbod
82ecaff736 Add hb_buffer_clear()
Which is like _reset(), but does NOT clear unicode-funcs.
2012-11-13 14:10:00 -08:00
Behdad Esfahbod
de796a6fb9 Add "new" Myanmar OT Script tag
Windows 8 added support for Myanmar shaping using the "mym2" script tag,
even though Windows never supported the old "mymr" tag.
2012-11-12 17:27:51 -08:00
Behdad Esfahbod
27f52dc3f6 Add Myanmar tests from UTN#11 2012-11-12 16:54:03 -08:00
Behdad Esfahbod
e6b86c8519 Add test for non-joining Mongolian letters
For U+1880..U+1886 Uniscribe thinks they are non-joining.
For U+1887 Uniscribe thinks it's joining, but looks wrong to me.
2012-11-05 15:18:49 -08:00
Behdad Esfahbod
f5e55754f9 Add Tifinagh test data 2012-11-02 13:53:18 -07:00
Behdad Esfahbod
c21498afd8 Add Mongolian and 'Phags-pa joining test cases 2012-11-02 10:21:26 -07:00
Behdad Esfahbod
431bef2e16 Minor build fix 2012-11-01 16:26:01 -07:00
Behdad Esfahbod
911ed09698 Ignore gid0 in test results 2012-10-29 19:42:19 -07:00
Behdad Esfahbod
10b88d89ef Add Ethiopic test case
This sequence: U+120B,U+135F,U+120B with the Nyala font from Win7
exposes a GPOS bug in Uniscribe, in that the positioned mark is wrongly
moved as a result of a following kern.

This is the one "failure" in the Ethiopic test suite :-).

ETHIOPIC: 118900 out of 118901 tests passed. 1 failed (0.000841036%)
2012-10-29 18:26:00 -07:00
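The per-script summary lines quoted in these commit messages follow one fixed pattern. A minimal Python sketch that reproduces the pattern is shown here; the function name `format_stats` is hypothetical, and the use of `%g` for the percentage is an assumption inferred from the six significant digits in the quoted output, not taken from the actual tool.

```python
def format_stats(script, passed, total):
    """Format a pass/fail summary line in the style quoted in the log."""
    failed = total - passed
    # Guard against an empty test set; the percentage is failures over total.
    percent = 100.0 * failed / total if total else 0.0
    return "%s: %d out of %d tests passed. %d failed (%g%%)" % (
        script, passed, total, failed, percent)
```

For example, `format_stats("ETHIOPIC", 118900, 118901)` yields the Ethiopic line quoted in the commit above.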
Behdad Esfahbod
166b5cf7ec [Indic] Find syllables before any features are applied
With FreeSerif, it seems that the 'ccmp' feature does ligature
substitutions.  That was then causing syllable match failures.  We now
find syllables before any features have been applied.

Test sequence: U+0D9A,U+0DCA,U+200D,U+0DBB,U+0DCF
2012-09-07 14:56:01 -04:00
Behdad Esfahbod
efb8d3eb71 Fixup test failure reporting
After we implemented dotted-circle, we were still ignoring any tests
that had dottedcircle in it for any of the shapers.  That meant that if
we wrongly outputted dottedcircle, the test was being ignored.  Ouch!

Fixing that shows regressions across the board.  Most are Uniscribe
bugs: NOT inserting dotted-circle when it should.  Some are our
machine bugs.  This is in fact a nice way to catch Indic-machine
deficiencies and when I fix the regressions, our clusters should be
much closer to Uniscribe.  For now, we regressed from:

BENGALI: 353997 out of 354285 tests passed. 288 failed (0.0812905%)
DEVANAGARI: 707339 out of 707394 tests passed. 55 failed (0.00777502%)
GUJARATI: 366489 out of 366506 tests passed. 17 failed (0.0046384%)
GURMUKHI: 60769 out of 60809 tests passed. 40 failed (0.0657797%)
KANNADA: 951086 out of 951913 tests passed. 827 failed (0.0868777%)
KHMER: 299106 out of 299124 tests passed. 18 failed (0.00601757%)
LAO: 53611 out of 53644 tests passed. 33 failed (0.0615167%)
MALAYALAM: 1048104 out of 1048416 tests passed. 312 failed (0.0297592%)
ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
SINHALA: 271747 out of 271847 tests passed. 100 failed (0.0367854%)
TAMIL: 1091837 out of 1091837 tests passed. 0 failed (0%)
TELUGU: 970558 out of 970573 tests passed. 15 failed (0.00154548%)
TIBETAN: 208469 out of 208469 tests passed. 0 failed (0%)

To:

BENGALI: 353990 out of 354285 tests passed. 295 failed (0.0832663%)
DEVANAGARI: 707315 out of 707394 tests passed. 79 failed (0.0111678%)
GUJARATI: 366447 out of 366506 tests passed. 59 failed (0.016098%)
GURMUKHI: 60707 out of 60809 tests passed. 102 failed (0.167738%)
KANNADA: 951042 out of 951913 tests passed. 871 failed (0.0915%)
KHMER: 298962 out of 299124 tests passed. 162 failed (0.0541581%)
LAO: 53611 out of 53644 tests passed. 33 failed (0.0615167%)
MALAYALAM: 1048074 out of 1048416 tests passed. 342 failed (0.0326206%)
ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
SINHALA: 271666 out of 271847 tests passed. 181 failed (0.0665816%)
TAMIL: 1091835 out of 1091837 tests passed. 2 failed (0.000183178%)
TELUGU: 970553 out of 970573 tests passed. 20 failed (0.00206064%)
TIBETAN: 208469 out of 208469 tests passed. 0 failed (0%)

Investigating.
2012-09-05 15:57:38 -04:00
Behdad Esfahbod
a4e75e4128 Minor 2012-08-27 15:54:15 -04:00
Behdad Esfahbod
206ab60573 [test] Move around 2012-08-10 09:06:30 -04:00
Behdad Esfahbod
7a484c601e [test] Add Urdu ligature sequences from CRULP 2012-08-10 09:05:29 -04:00
Behdad Esfahbod
378d279bbf Implement Unicode compatibility decompositions
Based on patch from Philip Withnall.
https://bugs.freedesktop.org/show_bug.cgi?id=41095
2012-07-31 21:36:16 -04:00
Behdad Esfahbod
70b3dc3272 Add Hebrew test 2012-07-30 12:40:18 -04:00
Behdad Esfahbod
a973b5ce86 [GSUB] Further adjustments to mark-attachment vs ligation interaction
The d1d69ec52e change broke Kannada badly,
since it was ligating consonants, pushing matra out, and then ligating
with the matra.  Adjust for that.  See comments.
2012-07-30 01:47:46 -04:00
Behdad Esfahbod
97a201becf Add Arabic tests for mark ligature component attachments 2012-07-29 20:37:29 -04:00
Behdad Esfahbod
5d874d566f [GPOS] Fix mark-to-mark positioning when one of the marks is a ligature
This commit: a3313e5400 broke MarkMarkPos
when one of the marks itself is a ligature.  That regressed 26 Tibetan
tests (up from zero!).  Fix that.  Tibetan back to zero.
2012-07-28 21:05:25 -04:00
Behdad Esfahbod
6411e74caf [Indic] Reposition Gurmukhi top matras to after post
The font is forming a post-base consonant in some samples, and Uniscribe
positions top matra on the post-base.  Do the same.

Gurmukhi failures down from 59 to 41 (0.0674242%).
2012-07-24 13:48:49 -04:00
Behdad Esfahbod
c3f769ba09 [Indic] Ignore Uniscribe output containing two zero-width space glyphs
Uniscribe is buggy and sometimes /eats/ a mark next to a non-joiner.
Most of Malayalam failures were actually hitting this bug.

Ignore test output with two zero-width space glyphs.  This is a hack
until we build up the test suite infrastructure better.

Bengali went down by 9, Devanagari by 2, Kannada by 130, Malayalam down
from 1197 to 307, Sinhala down by 16, Telugu down by 26.  New stats:

BENGALI: 353996 out of 354285 tests passed. 289 failed (0.0815727%)
DEVANAGARI: 693573 out of 693628 tests passed. 55 failed (0.00792932%)
GUJARATI: 366489 out of 366506 tests passed. 17 failed (0.0046384%)
GURMUKHI: 60750 out of 60809 tests passed. 59 failed (0.0970251%)
KANNADA: 951086 out of 951913 tests passed. 827 failed (0.0868777%)
KHMER: 299094 out of 299124 tests passed. 30 failed (0.0100293%)
MALAYALAM: 1048109 out of 1048416 tests passed. 307 failed (0.0292823%)
ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
SINHALA: 271715 out of 271847 tests passed. 132 failed (0.0485567%)
TAMIL: 1091837 out of 1091837 tests passed. 0 failed (0%)
TELUGU: 970550 out of 970573 tests passed. 23 failed (0.00236973%)
2012-07-24 13:26:32 -04:00
Behdad Esfahbod
65c43accdc [Indic] Better position left-matra in Malayalam
Just put it before base, which is what's expected.

Malayalam failures down from 1559 to 1197 (0.114172%).

BENGALI: 353988 out of 354285 tests passed. 297 failed (0.0838308%)
DEVANAGARI: 693571 out of 693628 tests passed. 57 failed (0.00821766%)
GUJARATI: 366489 out of 366506 tests passed. 17 failed (0.0046384%)
GURMUKHI: 60750 out of 60809 tests passed. 59 failed (0.0970251%)
KANNADA: 950956 out of 951913 tests passed. 957 failed (0.100534%)
KHMER: 299094 out of 299124 tests passed. 30 failed (0.0100293%)
MALAYALAM: 1047219 out of 1048416 tests passed. 1197 failed (0.114172%)
ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
SINHALA: 271699 out of 271847 tests passed. 148 failed (0.0544424%)
TAMIL: 1091837 out of 1091837 tests passed. 0 failed (0%)
TELUGU: 970524 out of 970573 tests passed. 49 failed (0.00504856%)
2012-07-24 03:36:47 -04:00
Behdad Esfahbod
88f413b56f [Indic] Implement Reph+Ya-Phalaa interaction
The sequence Ra,H,Ya in Bengali is ambiguous and Unicode encoded that to
get Ya-Phalaa, one would place ZWJ before Halant.  Ie. a ZWJ,H sequence
requests subjoining, while a H,ZWJ requests Half form.  Implement that.

Bengali failures go down from 377 to 297 (0.0838308%).
Gujarati is down by 4 to 17 (0.0046384%).
Kannada is down by 226 to 957 (0.100534%).

Current status:

BENGALI: 353988 out of 354285 tests passed. 297 failed (0.0838308%)
DEVANAGARI: 693571 out of 693628 tests passed. 57 failed (0.00821766%)
GUJARATI: 366489 out of 366506 tests passed. 17 failed (0.0046384%)
GURMUKHI: 60750 out of 60809 tests passed. 59 failed (0.0970251%)
KANNADA: 950956 out of 951913 tests passed. 957 failed (0.100534%)
KHMER: 299094 out of 299124 tests passed. 30 failed (0.0100293%)
MALAYALAM: 1046857 out of 1048416 tests passed. 1559 failed (0.148701%)
ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
SINHALA: 271699 out of 271847 tests passed. 148 failed (0.0544424%)
TAMIL: 1091837 out of 1091837 tests passed. 0 failed (0%)
TELUGU: 970524 out of 970573 tests passed. 49 failed (0.00504856%)
2012-07-24 03:04:36 -04:00
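The ZWJ,H versus H,ZWJ distinction from the Reph+Ya-Phalaa commit above can be sketched as a small classifier. This is an illustration only, not the shaper's logic: the function name `joiner_request` and its return labels are hypothetical, and it looks only at the first Halant in the string.

```python
ZWJ = "\u200D"
BENGALI_HALANT = "\u09CD"


def joiner_request(text, halant=BENGALI_HALANT):
    """Classify what the joiner around the first Halant requests."""
    i = text.find(halant)
    if i < 0:
        return "none"
    if i > 0 and text[i - 1] == ZWJ:
        return "subjoined"   # ZWJ,H: request the subjoined form (e.g. Ya-Phalaa)
    if i + 1 < len(text) and text[i + 1] == ZWJ:
        return "half"        # H,ZWJ: request the half form
    return "default"         # bare H: left to the shaper's default behaviour
```

With Ra (U+09B0) and Ya (U+09AF), `joiner_request("\u09B0\u200D\u09CD\u09AF")` returns `"subjoined"`, while moving the ZWJ after the Halant returns `"half"`.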
Behdad Esfahbod
330b329c89 [Indic] Unmark U+17D1 KHMER SIGN VIRIAM to NOT be a Virama
Fixes another 1 Khmer failure.  Down to 30 (0.0100293%) now.
2012-07-24 02:25:26 -04:00
Behdad Esfahbod
d90b8e841e [Indic] Reposition Khmer prebase-reordering Ra around split matras
In Khmer coeng model, a V,Ra can go *after* matras.  If it goes after a
split matra, it should be reordered to *before* the left part of such matra.

Khmer failures down from 136 to 39 (0.0130381%).
2012-07-24 02:11:18 -04:00
Behdad Esfahbod
7573799126 [Indic] Position Khmer U+17CE
Fixes another 6 Khmer failures.  Now at 136 (0.0454661%).
2012-07-24 01:32:07 -04:00
Behdad Esfahbod
2278eefcdb [Indic] In Sinhala, form forced Reph even if no other consonant found
Fixes another 10 Sinhala failures.  Down to 148 (0.0544424%).
2012-07-24 00:31:10 -04:00
Behdad Esfahbod
71fd5e80ad [Indic] Further adjust base algorithm for Sinhala
Apparently if there is C,V,ZWJ,C, the first C will be base, but if
it's C,ZWJ,V,C, the second one will be.

Note that Uniscribe implements this differently, by breaking syllable in
the case of C,ZWJ,V,C and putting the first consonant in one syllable
and the rest in the next syllable.

Sinhala failures down from 208 to 158 (0.0581209%).  No changes to
Khmer.
2012-07-24 00:21:16 -04:00
Behdad Esfahbod
73d71cc527 [Indic] End Vowel-based syllable at ZWJ
One Devanagari test regressed, plus 10 Malayalam (at 1545 now).

Fixed 120 Sinhala failures.  Now at 208 (0.0765136%).
2012-07-24 00:09:12 -04:00
Behdad Esfahbod
34c215036f [Indic] Improve Sinhala base algorithm and reph positioning
Sinhala does not have half forms.  And most (all?) consonants can be
base, except when preceded by ZWJ, which would request a subjoined form.
Hence switch the base algorithm to categorize with Khmer, start search
at start, and stop at a ZWJ.

Also, mark all pos=base consonants after base to be subjoined.  Mark
base itself to have pos=base.

Finally, adjust Sinhala's reph position to after-main.

Brings down Sinhala failures from 455 to 328 (0.120656%).
2012-07-23 23:51:29 -04:00
Behdad Esfahbod
771a8f5028 [Indic] exclude ligatures when matching on Indic category
If, say, a H,ZWJ,C ligature was formed, we don't want the code to detect
that as a Halant.  So, ignore ligatures when matching category in
final_reordering.

Sinhala failures down from 514 to 455 (0.167374%).
2012-07-23 20:09:30 -04:00
Behdad Esfahbod
42848453bf [Thai] Reorder U+0E3A THAI VOWEL SIGN PHINTHU
Uniscribe reorders U+0E3A to be after U+0E38 and U+0E39.  We do that by
modifying the ccc for U+0E3A.

Fixes the two remaining Thai failures (see previous commit).
2012-07-23 13:52:07 -04:00
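Reordering by an overridden combining class, as described for U+0E3A above, can be sketched in Python with the standard `unicodedata` module. The override value 104 is an arbitrary assumption chosen only to sort after ccc 103 (U+0E38 and U+0E39); the commit message does not say which value HarfBuzz uses, and the helper names are hypothetical.

```python
import unicodedata

# Hypothetical override: give U+0E3A a class above 103 so it sorts after U+0E38/U+0E39.
CCC_OVERRIDE = {"\u0E3A": 104}


def modified_ccc(ch):
    """Combining class with the override table applied."""
    return CCC_OVERRIDE.get(ch, unicodedata.combining(ch))


def reorder_marks(text):
    """Stable-sort each run of marks (ccc != 0) by modified combining class."""
    out, run = [], []
    for ch in text:
        if modified_ccc(ch) == 0:
            out.extend(sorted(run, key=modified_ccc))  # flush marks before a starter
            run = []
            out.append(ch)
        else:
            run.append(ch)
    out.extend(sorted(run, key=modified_ccc))
    return "".join(out)
```

Because Python's `sorted` is stable, marks with equal class keep their input order, which is the same guarantee canonical reordering relies on.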
Behdad Esfahbod
4a7f4f3e56 [Thai] Adjust SARA AM reordering to match Uniscribe
Adjust the list of marks before SARA AM that get the reordering
treatment.  Also adjust cluster formation to match Uniscribe.

With Wikipedia test data, now I see:

  - For Thai, with the Angsana New font from Win7, I see 54 failures out
    of over 4M tests  (0.00129107%).  Of the 54, two are legitimate
    reordering issues (fix coming soon), and the other 52 are simply
    Uniscribe using a zero-width space char instead of an unknown
    character for missing glyphs.  No idea why.  The missing-glyph
    sequences include one that is a Thai character followed by an Arabic
    Sokun.  Someone confused it with Nikhahit I assume!

  - For Lao, with the Dokchampa font from Win7, 33 tests fail out of
    54k (0.0615167%).  All seem to be insignificant mark positioning
    with two marks on a base.  Have to investigate.
2012-07-23 13:15:33 -04:00
Behdad Esfahbod
60554f14d8 [Indic] Merge in Malayalam tests
From:
http://silpa.org.in/pub/tests/hb/ml/ml-harfbuzz-testdata.txt
2012-07-22 23:23:56 -04:00
Behdad Esfahbod
5c7081770c [Indic] Add extensive Sinhala tests
Generated by:
http://git.savannah.gnu.org/cgit/sinhala.git/plain/utils/gen-unicode-sinhala.py
2012-07-22 23:20:27 -04:00
Behdad Esfahbod
2efe4707b1 [Indic] Add Sinhala tests
Merge tests from:
http://git.savannah.gnu.org/cgit/sinhala.git/plain/patches/icu-sinhala-rendering.txt
2012-07-22 23:17:59 -04:00
Behdad Esfahbod
3d4c111b7a Add a test case 2012-07-20 19:34:39 -04:00
Behdad Esfahbod
bdd080431a [Indic] Reposition Oriya Candrabindu
Oriya failures down from 0.65% to 0.20%.
2012-07-20 16:03:09 -04:00
Behdad Esfahbod
87cd63266e [Indic] Recategorize some Kannada right matras
Kannada failures down from 3.5% to 2.93%.
2012-07-19 21:25:46 -04:00
Behdad Esfahbod
c87bcddb10 [Indic] Add failing test for Kannada 2012-07-19 20:03:25 -04:00
Behdad Esfahbod
deeb540a74 [test] Ignore tests with DOTTED CIRCLE in the output 2012-07-19 11:30:48 -04:00
Behdad Esfahbod
422ecd2d3c [Indic] Accept a forced Rakar sequence at the end of syllable
In Sinhala, Rakar is formed by Al-Lakuna,ZWJ,Ra.  If you put that at the
end of a Consonant,Matra syllable, you get a dotted-circle from
Uniscribe.  Apparently adding a ZWJ before the Al-Lakuna "fixes" that.
And people have been encoding that sequence...  So, allow a forced
"ZWJ,Virama,ZWJ,Ra" sequence at the end of syllables.

Fixes some 100 or more of Sinhala failures.  Now at 622 only (0.23%).
2012-07-18 23:25:58 -04:00
Behdad Esfahbod
10cdc94eee [Indic] In final reordering, find base, even if it disappeared
POS_BASE can disappear if base ligated backward.  Define base as last
with position not after base.

Fixes a few hundred of Sinhala failures with Iskoola Pota.
2012-07-18 17:43:23 -04:00
Behdad Esfahbod
3285e107c9 [Indic] Implement Sinhala "Al Lakuna" Reph behavior
In Sinhala, Reph is formed only explicitly, by the presence of a ZWJ.
2012-07-18 17:22:14 -04:00
Behdad Esfahbod
552d19b7a1 [Indic] Treat Register Shifters like Nukta
Really this time.

Fixes another 18 Khmer tests.
2012-07-18 16:02:33 -04:00
Behdad Esfahbod
69f26bf39c [Indic] Fix Matra reordering when base is at end of syllable
For example: U+915,U+200c,U+93f

Fixes last Tamil failure!
2012-07-18 15:47:51 -04:00
Behdad Esfahbod
391cc03317 [Indic] Allow halant group in Vowel and placeholder syllables
Fixes 2 out of 560 Devanagari failures.  AND:
Fixes 1 out of 2 Tamil failures.
2012-07-18 15:12:49 -04:00
Behdad Esfahbod
418d00dffd [Indic] Minor 2012-07-18 14:57:28 -04:00
Behdad Esfahbod
25bc489498 [Indic] Better categorize Register Shifters and Khmer Various signs
Down another 500 or so Khmer failures!
2012-07-17 17:53:03 -04:00
Behdad Esfahbod
34b5714906 [Indic] Treat Khmer Register Shifters more like Nuktas
Except that there may be a ZWNJ before a Register Shifter.
2012-07-17 14:09:32 -04:00
Behdad Esfahbod
0201e0a464 [Indic] Apply 'cfar' for Khmer
Mark stuff after a pre-base reordering Ro 'cfar'.  Used in Khmer.
This allows distinguishing the following cases with MS Khmer fonts:

  U+1784,U+17D2,U+179A,U+17D2,U+1782
  U+1784,U+17D2,U+1782,U+17D2,U+179A
2012-07-17 13:56:24 -04:00
Behdad Esfahbod
55f70ebfb9 [Indic] Position final subjoined consonants (and vowels) after matras
In Khmer, a final subjoined consonant or independent vowel can occur
after matras.  This final subjoined thing should NOT be reordered to
before the matra even though it's subjoined.

Fixes another 1k of the Khmer failures.  Not much left really.
2012-07-17 12:50:13 -04:00
Behdad Esfahbod
c50ed71e9a [Indic] Recategorize Khmer coeng sign as a separate category OT_Coeng
Amend the syllable structure to allow a final subscripted consonant
(Coeng+C) and a final subscripted independent vowel (Coeng+V).
Fixes another 2k of Khmer failures.
2012-07-17 11:54:28 -04:00
Behdad Esfahbod
74ccc6a132 [Indic] Move Halant with after-base consonants
Normally, we attach the Halant to the previous character and move it
with it.  For after-base consonants however, the Halant "belongs" to the
consonant after, so attach it so.

This fixes Bengali sequences involving post-base consonant Ya, which
should ligate with the Halant to form Ya Phala, but previously a
reordered matra was blocking the ligation.
2012-07-17 11:16:19 -04:00
Behdad Esfahbod
d5c4edcdd6 [Indic] Apply presentation-forms features all at once
Seems like this is what Uniscribe is doing, and does not break any fonts
we tested (with Devanagari, Malayalam, Khmer, and Bengali), while fixing
some Ra Phala sequences for Bengali with Vrinda.  Fixes another 2% of
Bengali failures (a couple more to go).
2012-07-17 10:40:59 -04:00
Behdad Esfahbod
6de103547e [test/arabic] Add Arabic tests for mark skipping
Expose a bug with Khaled's Hussaini Nastaleeq font.
2012-07-16 22:46:52 -04:00
Behdad Esfahbod
1167c7bfc9 Minor 2012-07-11 18:00:28 -04:00
Behdad Esfahbod
aa116582e6 Minor 2012-07-11 18:00:28 -04:00
Behdad Esfahbod
5e113a4b79 g_thread_init() is deprecated 2012-06-16 15:26:13 -04:00
Behdad Esfahbod
a18280a8ce Fix warnings produced by clang analyzer 2012-06-07 15:44:12 -04:00
Behdad Esfahbod
b0a6e58bb3 s/script-punjabi/script-gurmukhi/ 2012-06-04 10:21:22 -04:00
Behdad Esfahbod
4efdffec09 Minor Malayalam test case
From https://bugs.freedesktop.org/show_bug.cgi?id=45166
2012-05-28 10:45:50 -04:00
Behdad Esfahbod
dfff5b3021 Add Myanmar test case 2012-05-28 10:45:50 -04:00
Behdad Esfahbod
ff3524c21a Add Arabic diacritics tests 2012-05-23 21:50:43 -04:00
Behdad Esfahbod
a6de53664d Add CJK Compatibility Ideographs tests
From:
http://people.mozilla.org/~jdaggett/tests/cjkcompat.html
2012-05-18 15:04:35 -04:00
Behdad Esfahbod
f538fcb538 [test] Make tool usage easier by not requiring "--stdin"
Just default to it.  Added "--help" instead to get usage.
2012-05-12 15:34:40 +02:00
Behdad Esfahbod
a3273e30bb [Indic] Add more Malayalam tests 2012-05-12 13:34:18 +02:00
Behdad Esfahbod
5b16de97bc [Indic] Add tests for dottedcircle 2012-05-11 19:55:42 +02:00
Behdad Esfahbod
c071b99f15 [Indic] Add test for Left Matra with Halant
Uniscribe doesn't move the Halant, we do.  And do a broken job of it now.
2012-05-11 16:22:46 +02:00
Behdad Esfahbod
b20c9ebaf5 [Indic] Add test for matra group
The spec says: "[{M}+[N]+[H]]", and that's what Uniscribe implements.
We instead do: "{M+[N]+[H]}", which means we allow Nukta and Halant
after all Matras, not just the last one.  It makes more sense.
2012-05-10 18:31:17 +02:00
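The difference between the two matra-group grammars quoted above can be made concrete with regular expressions over category letters. This sketch assumes each character has already been mapped to a one-letter category (M = Matra, N = Nukta, H = Halant); the pattern names and the `accepts` helper are hypothetical, and the real syllable machine is not implemented this way.

```python
import re

# "[{M}+[N]+[H]]": any number of matras, then one optional Nukta and Halant.
SPEC_MATRA_GROUP = re.compile(r"(?:M*N?H?)?")

# "{M+[N]+[H]}": every matra may carry its own optional Nukta and Halant.
OURS_MATRA_GROUP = re.compile(r"(?:MN?H?)*")


def accepts(pattern, categories):
    """True if the whole category string is a valid matra group for the pattern."""
    return pattern.fullmatch(categories) is not None
```

For instance, the category string `"MNM"` (a Nukta after the first of two matras) is rejected by the spec grammar and accepted by the relaxed one.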
Behdad Esfahbod
61a58e26a5 [Indic] Add tricky reordering test cases
In the case of Consonant,LeftMatra,Halant, Uniscribe leaves the Halant
where it is, but we want to move it with the Matra as that makes more
logical sense.
2012-05-10 14:43:53 +02:00
Behdad Esfahbod
3943293a99 [Indic] Add joiner test cases for Devanagari 2012-05-09 15:27:56 +02:00
Behdad Esfahbod
2214a03900 Add hb-diff-ngrams 2012-05-09 09:54:54 +02:00
Behdad Esfahbod
178e6dce01 Add N-gram generator 2012-05-09 08:57:29 +02:00
Behdad Esfahbod
98669ceb77 Use groupby() 2012-05-09 08:16:15 +02:00
Behdad Esfahbod
c438a14b62 Add hb-diff-stat 2012-05-09 07:45:17 +02:00
Behdad Esfahbod
1058d031e2 Make hb-diff-filter-failures retain all test info for failed tests 2012-05-09 07:35:28 +02:00
Behdad Esfahbod
f1eb008cc7 Add hb-diff-colorize
Accepts --format=html now.
2012-05-09 00:01:50 +02:00
Behdad Esfahbod
9155e4ffe0 Cleanup diff
Doesn't do --color anymore.  That will go into a new hb-diff-colorize
tool.
2012-05-08 22:44:21 +02:00
Behdad Esfahbod
7d22135b4c Make hb-diff faster 2012-05-08 19:38:49 +02:00
Behdad Esfahbod
a93e238e05 More tests 2012-05-08 18:55:29 +02:00
Behdad Esfahbod
585b107cde Add test cases for a minority language using Bengali
U+0985 BENGALI LETTER A followed by U+09D7 BENGALI AU LENGTH MARK.
According to Bobby de Vos on the mailing list, this results in a dotted
circle with most shaping engines, but is a legitimate sequence in this
minority language.

We reached the consensus on the list to NOT implement dotted-circle
in HarfBuzz.
2012-04-24 16:00:50 -04:00
Behdad Esfahbod
0290bbf861 Add another Thai test 2012-04-17 10:28:21 -04:00
Behdad Esfahbod
4d85252bda Add Japanese test data from Adobe's Kazuraki font ligatures 2012-04-16 15:54:26 -04:00
Behdad Esfahbod
fe28b997fb Add HB_DIRECTION_IS_VALID 2012-04-14 19:19:26 -04:00
Behdad Esfahbod
4bf90f6483 Make HB_DIRECTION_INVALID be zero
This changes all the HB_DIRECTION_* enum member values, but is
nicer, in preparation for making hb_segment_properties_t public.
2012-04-12 17:38:23 -04:00
Behdad Esfahbod
f9746b600a Minor 2012-04-12 09:59:26 -04:00
Behdad Esfahbod
7470b0ff80 Add Mongolian test case 2012-04-12 09:44:27 -04:00
Behdad Esfahbod
a4976447cd Add Hangul test 2012-04-11 17:48:40 -04:00
Behdad Esfahbod
e95d912b3b Fix diff tool 2012-04-11 17:33:02 -04:00
Behdad Esfahbod
e099dd6592 Add Thai test case for SARA AM decomposition 2012-04-10 10:47:33 -04:00
Behdad Esfahbod
4450dc9354 Move around 2012-04-07 22:07:23 -04:00
Behdad Esfahbod
aaa25d5f45 Add Hangul test case
Composed, and decomposed, of the same text.
2012-04-05 17:27:23 -04:00
Behdad Esfahbod
406044986a Add Hebrew diacritics test cases
From:
https://bugzilla.mozilla.org/show_bug.cgi?id=662055
2012-03-06 20:24:31 -05:00
Behdad Esfahbod
7a70ca78e0 Add test case from https://bugzilla.mozilla.org/show_bug.cgi?id=714067 2012-02-21 11:31:47 -05:00
Behdad Esfahbod
1a5a91dc0d Add a few more tests 2012-01-22 19:58:23 -05:00
Behdad Esfahbod
1795f3a222 Add a couple Thai test cases from Thep 2012-01-22 19:29:45 -05:00
Behdad Esfahbod
ec3f506682 Add Devanagari test from Tom Hacohen 2012-01-22 19:10:55 -05:00
Behdad Esfahbod
71be4ca3dd Also ignore "ChangeLog" in manifests 2012-01-22 16:26:49 -05:00
Behdad Esfahbod
3c9a39ecd6 Remove newline 2012-01-22 16:21:19 -05:00
Behdad Esfahbod
e4ccbfe276 Allow --color=html in hb-diff
Not that useful right now as we don't escape < and >.  Perhaps
another tool can be added to convert the ANSI output to HTML.
2012-01-22 16:07:32 -05:00
Behdad Esfahbod
8f80f93491 More shoveling around 2012-01-21 20:03:25 -05:00
Behdad Esfahbod
c78c6e9844 Cleanup 2012-01-21 19:55:16 -05:00
Behdad Esfahbod
ab94a9c542 Distribute testing tools 2012-01-21 19:43:58 -05:00
Behdad Esfahbod
3e86feb54c Speed up colorless diff 2012-01-21 19:40:30 -05:00
Behdad Esfahbod
1e58df6034 Cleanup manifest code 2012-01-21 19:37:31 -05:00
Behdad Esfahbod
956d552e10 Port hb-manifest-update to Python 2012-01-21 19:31:51 -05:00
Behdad Esfahbod
3a34e9e351 Ignore Broken Pipe errors 2012-01-21 19:15:41 -05:00
Behdad Esfahbod
f22089ac24 Misc fixes 2012-01-20 21:22:14 -05:00
Behdad Esfahbod
96968bfae5 Port hb-manifest-read to Python 2012-01-20 21:16:34 -05:00
Behdad Esfahbod
a59ed46fa4 Add final residues from test-shape-complex 2012-01-20 20:56:32 -05:00
Behdad Esfahbod
820e0ed318 Add Punjabi tests from test-shape-complex also 2012-01-20 20:51:52 -05:00
Behdad Esfahbod
a7d71c1057 Add Tamil test data from Muguntharaj Subramanian 2012-01-20 20:50:09 -05:00
Behdad Esfahbod
5992a9941e Import test data from late test-shape-complex 2012-01-20 20:48:14 -05:00
Behdad Esfahbod
46ac456477 Fix Unicode encoding issue 2012-01-20 19:32:17 -05:00
Behdad Esfahbod
ad34e39a4a Make test tools interactive
By bypassing readlines() buffering.
2012-01-20 18:40:25 -05:00
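One common way to avoid read-ahead buffering when consuming a stream line by line is to call `readline()` in a loop instead of iterating over the file or calling `readlines()`. The sketch below shows that pattern; whether the test tools use exactly this form is an assumption, and the helper name `lines_unbuffered` is hypothetical.

```python
import io


def lines_unbuffered(stream):
    """Yield lines as soon as each one is available.

    iter() with a sentinel keeps calling stream.readline() until it
    returns the empty string, which signals end of input.
    """
    return iter(stream.readline, "")


if __name__ == "__main__":
    # Demonstration on an in-memory stream; with a terminal or pipe each
    # line would be handed over as soon as it is read.
    for line in lines_unbuffered(io.StringIO("first\nsecond\n")):
        print(line.rstrip("\n"))
```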
Behdad Esfahbod
91540a7d97 Move most testing logic into hb_test_tools.py
The actual utils are one-liners now.
2012-01-20 18:28:10 -05:00
Behdad Esfahbod
66aa080033 Remove test-shape-complex
New shaping testsuite and framework coming.
2012-01-20 17:36:10 -05:00
Behdad Esfahbod
ed459bfb63 Add hb-unicode-encode 2012-01-20 17:24:05 -05:00
Behdad Esfahbod
b12c4d4361 Add hb-diff-filter-failures 2012-01-20 17:17:44 -05:00
Behdad Esfahbod
d4bffbc55b Move 2012-01-20 17:16:35 -05:00
Behdad Esfahbod
45f640c98d Minor 2012-01-20 14:24:21 -05:00
Behdad Esfahbod
47ca766a9c Minor 2012-01-20 14:21:53 -05:00
Behdad Esfahbod
8f1db07894 [test/shaping] Add some Indic test data for the new test suite
Imported from UTRRS.
2012-01-20 14:00:44 -05:00
Behdad Esfahbod
11267aef36 Fix 2012-01-20 13:57:14 -05:00
Behdad Esfahbod
4e84ce48d5 Move hb-diff to test/shaping/ 2012-01-20 13:51:22 -05:00
Behdad Esfahbod
f868e1b84d Add hb-unicode-decode 2012-01-20 13:50:05 -05:00
Behdad Esfahbod
9ab23ef474 Minor 2012-01-20 13:49:56 -05:00
Behdad Esfahbod
c8d81db033 Recognize more characters 2012-01-20 13:39:27 -05:00
Behdad Esfahbod
0016d4662d [test] Make hb-unicode-prettyname take a --stdin option 2012-01-20 13:31:59 -05:00
Behdad Esfahbod
ad8c6446f2 [test/shaping] Add hb-unicode-prettyname 2012-01-20 13:27:40 -05:00
Behdad Esfahbod
e900869b0f [test/shaping] Add hb-read-manifest 2012-01-19 20:28:15 -05:00
Behdad Esfahbod
a211cd3ffc Ignore AUTHORS also 2012-01-19 20:27:53 -05:00
Behdad Esfahbod
36fe87d1b4 More Indic tests from Pravin 2012-01-19 16:55:26 -05:00
Behdad Esfahbod
a33e46cf7d [test/shaping] Add hb-update-manifests 2012-01-19 15:44:55 -05:00
Behdad Esfahbod
d4de562adf Start adding new shaping test suite together 2012-01-19 15:21:04 -05:00
Behdad Esfahbod
4d6dafd47f Rename test/ to test/api/ 2012-01-19 14:52:02 -05:00
Behdad Esfahbod
8d2781d692 [test] Add two Indic test cases from Bernard Massot 2012-01-19 11:36:39 -05:00
Behdad Esfahbod
055fb24d03 Add test for bug in ICU decompose
As reported by Kenichi Ishibashi on 2011-10-28.
2012-01-18 22:11:31 -05:00
Behdad Esfahbod
a17554bfd5 Make test-c.c actually use hb
This will make sure we test that C code can actually link to the
library.
2011-09-28 16:57:34 -04:00
Behdad Esfahbod
738d096a06 Pass through unknown ISO 639-3 language tags to OpenType engine
In hb_ot_tag_from_language(), if first component of an unknown
language is three letters long, use it directly as OpenType language
tag (after case conversion and padding).
2011-09-02 13:31:19 -04:00
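The pass-through rule in the commit above can be sketched as follows. The commit only says "case conversion and padding"; upper-casing and padding with a space to four characters are assumptions based on the usual OpenType tag convention, splitting on "-" assumes BCP 47 style language strings, and the function name is hypothetical.

```python
def ot_tag_from_unknown_language(lang):
    """Return a 4-character tag for an unknown language, or None if the rule does not apply."""
    first = lang.split("-", 1)[0]
    # Only a three-letter ASCII first component is passed through.
    if len(first) == 3 and first.isascii() and first.isalpha():
        return first.upper().ljust(4)  # assumed: upper-case, space-padded
    return None
```

Under these assumptions `ot_tag_from_unknown_language("xyz-Latn")` gives `"XYZ "`, while a two-letter code such as `"en"` is not passed through.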
Behdad Esfahbod
4c9fe88d30 [API] Make all _from_string() functions take a len parameter
Can be -1 for NUL-terminated string.  This is useful for passing parts
of a larger string to a function without having to copy or modify the
string first.

Affected functions:

	hb_tag_from_string()
	hb_direction_from_string()
	hb_language_from_string()
	hb_script_from_string()
2011-08-26 09:22:12 +02:00
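The length convention described above (an explicit length, or -1 meaning "read up to the NUL terminator") can be modelled in Python on byte strings. This is a sketch of the calling convention only, loosely modelled on a four-byte tag parser; the space padding and the function body are assumptions, not the C implementation.

```python
def tag_from_string(data, length=-1):
    """Build a 4-byte tag from at most `length` bytes of `data`.

    length == -1 means "treat data as NUL-terminated", mirroring the
    convention described in the commit message.
    """
    if length < 0:
        length = len(data)
    chunk = data[:length]
    nul = chunk.find(b"\0")
    if nul >= 0:
        chunk = chunk[:nul]           # never read past a NUL terminator
    return chunk[:4].ljust(4, b" ")   # assumed: truncate to 4 bytes, pad with spaces
```

The point of the length parameter is visible in `tag_from_string(b"liga,kern", 4)`: the caller parses the first field of a larger string without copying or modifying it.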
Behdad Esfahbod
e6c09cdf43 Remove the pre_allocate argument from hb_buffer_create()
For two reasons:

1. User can always call hb_buffer_pre_allocate() themselves, and

2. Now we do a pre_alloc in add_utfX anyway, so the total number of
reallocs is limited to a small number (~3) anyway.  This just makes the
API cleaner.
2011-08-19 19:20:26 +02:00
Behdad Esfahbod
217cc81cd9 [test/shape-complex] Print cluster and position info in --verbose 2011-08-09 14:00:44 +02:00