Commit Graph

3471 Commits

Author SHA1 Message Date
Behdad Esfahbod
976c8f4552 New API: hb_buffer_[sg]et_replacement_codepoint()
With this change, we now by default replace broken UTF-8/16/32 bits
with U+FFFD.  This can be changed by calling new API on the buffer.
Previously the replacement value used to be (hb_codepoint_t)-1.

Note that hb_buffer_clear_contents() does NOT reset the replacement
character.

See discussion here:

6f13b6d62d

New API:

  hb_buffer_set_replacement_codepoint()
  hb_buffer_get_replacement_codepoint()
2014-07-16 15:34:20 -04:00
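
The semantics described in this commit (a default of U+FFFD, a setter/getter pair, and a clear operation that leaves the replacement untouched) can be modelled with a small stand-in type. This is only a sketch: `ToyBuffer` and its members are hypothetical names for illustration and are not HarfBuzz's `hb_buffer_t` or its real implementation.

```cpp
#include <cstdint>

// Hypothetical stand-in for a shaping buffer; NOT the real hb_buffer_t.
struct ToyBuffer {
  std::uint32_t replacement = 0xFFFDu;  // default replacement: U+FFFD
  unsigned length = 0;                  // number of items currently held

  constexpr void set_replacement(std::uint32_t cp) { replacement = cp; }
  constexpr std::uint32_t get_replacement() const { return replacement; }

  // Drops the contents but deliberately keeps the replacement setting,
  // mirroring the note about hb_buffer_clear_contents() above.
  constexpr void clear_contents() { length = 0; }
};

// Sets a custom replacement, clears, and reports whether it survived.
constexpr bool clear_keeps_replacement() {
  ToyBuffer b;
  b.set_replacement(0x25A1u);  // arbitrary custom value for the demo
  b.length = 3;
  b.clear_contents();
  return b.length == 0 && b.get_replacement() == 0x25A1u;
}
```

A caller that wants the default back after clearing would, under this model, have to set it again explicitly.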
Behdad Esfahbod
bcba8b4502 New API hb_buffer_add_codepoints()
Like hb_buffer_add_utf32, but doesn't do any Unicode validation.
This is like what hb_buffer_add_utf32 used to be until a couple
commits ago.
2014-07-16 14:59:04 -04:00
Behdad Esfahbod
625dbf141a [buffer] Templatize UTF-* functions 2014-07-16 14:52:59 -04:00
Behdad Esfahbod
e634fed428 [buffer] Validate UTF-32 input
Same as what we do for UTF-8 and UTF-16.
2014-07-16 14:17:26 -04:00
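
As a rough illustration of what validating UTF-32 input means: a code unit is acceptable only if it is a Unicode scalar value, that is, at most U+10FFFF and outside the surrogate range U+D800..U+DFFF. The helper names below are assumptions made for this sketch, not functions from the HarfBuzz source.

```cpp
#include <cstdint>

// True for Unicode scalar values: 0..0xD7FF and 0xE000..0x10FFFF.
constexpr bool is_unicode_scalar(std::uint32_t cp) {
  return cp < 0xD800u || (cp >= 0xE000u && cp <= 0x10FFFFu);
}

// Returns cp unchanged when valid, otherwise the given replacement.
// (Hypothetical helper; the default mirrors U+FFFD.)
constexpr std::uint32_t validate_utf32(std::uint32_t cp,
                                       std::uint32_t replacement = 0xFFFDu) {
  return is_unicode_scalar(cp) ? cp : replacement;
}
```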
Behdad Esfahbod
b98c5db32d Minor refactoring 2014-07-16 13:44:01 -04:00
Behdad Esfahbod
844f1a487d [tests] Add record-test.sh 2014-07-16 13:32:51 -04:00
Behdad Esfahbod
3b861421a7 Fix Mongolian Variation Selectors for fonts without GDEF
Originally we fixed those in 79d1007a50.
However, fonts like MongolianWhite don't have GDEF, but have IgnoreMarks
in their LigatureSubstitute init/etc features.  We were synthesizing a
GDEF class of mark for Mongolian Variation Selectors and as such the
ligature lookups were not matching.  Uniscribe doesn't do that.

I tried with more sophisticated fixes, like, if there is no GDEF and
a lookup-flag mismatch happens, instead of rejecting a match, try
skipping that glyph.  That surely produces some interesting behavior,
but since we don't want to support fonts missing GDEF more than we have
to, I went for this simpler fix which is to always mark
default-ignorables as base when synthesizing GDEF.

Micro-test added.

Fixes rest of https://bugs.freedesktop.org/show_bug.cgi?id=65258
2014-07-16 13:30:26 -04:00
Behdad Esfahbod
878a25375b Minor 2014-07-16 13:21:59 -04:00
Behdad Esfahbod
ec181e5014 Minor moving around 2014-07-16 13:10:03 -04:00
Behdad Esfahbod
e7ce50d9eb [indic] Fix access past end of array 2014-07-16 12:30:39 -04:00
Behdad Esfahbod
73e23b0acf Whitespace 2014-07-15 18:43:49 -04:00
Behdad Esfahbod
f27be105af [Android.mk] Actually remove static library 2014-07-11 18:15:34 -04:00
Behdad Esfahbod
96b80e9bcc [Android.mk] Remove static library, add note re how to build 2014-07-11 17:00:12 -04:00
Behdad Esfahbod
b7bc0b671d Simplify / speed up UTF-8 code 2014-07-11 16:22:13 -04:00
Behdad Esfahbod
af2490c095 Only accept well-formed UTF-8 sequences
Enable tests that were disabled before, and adjust one test,
and add more tests.
2014-07-11 16:22:13 -04:00
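
For readers unfamiliar with the term, "well-formed" here excludes overlong encodings, encoded surrogates, and values above U+10FFFF. The following is a minimal decoder sketch along those lines. The names `Utf8Result` and `decode_utf8` are hypothetical, and the choice to consume exactly one byte on error is an assumption of this sketch rather than something stated in the commit.

```cpp
#include <cstddef>
#include <cstdint>

struct Utf8Result {
  std::uint32_t codepoint;  // decoded value, or the replacement on error
  std::size_t consumed;     // bytes consumed (0 only for empty input)
};

constexpr bool is_continuation(unsigned char b) { return (b & 0xC0u) == 0x80u; }

// Decodes one code point from s[0..len). Assumption: errors consume 1 byte.
constexpr Utf8Result decode_utf8(const char *s, std::size_t len,
                                 std::uint32_t replacement = 0xFFFDu) {
  if (len == 0) return {replacement, 0};
  unsigned char b0 = static_cast<unsigned char>(s[0]);
  if (b0 < 0x80u) return {b0, 1};  // ASCII

  std::size_t need = 0;   // continuation bytes expected
  std::uint32_t cp = 0;   // accumulated value
  std::uint32_t min = 0;  // smallest value this length may encode
  if (b0 >= 0xC2u && b0 <= 0xDFu)      { need = 1; cp = b0 & 0x1Fu; min = 0x80u; }
  else if ((b0 & 0xF0u) == 0xE0u)      { need = 2; cp = b0 & 0x0Fu; min = 0x800u; }
  else if (b0 >= 0xF0u && b0 <= 0xF4u) { need = 3; cp = b0 & 0x07u; min = 0x10000u; }
  else return {replacement, 1};        // stray continuation or invalid lead

  if (len < need + 1) return {replacement, 1};  // truncated sequence
  for (std::size_t i = 1; i <= need; ++i) {
    unsigned char b = static_cast<unsigned char>(s[i]);
    if (!is_continuation(b)) return {replacement, 1};
    cp = (cp << 6) | (b & 0x3Fu);
  }
  // Reject overlong forms, surrogates, and values past U+10FFFF.
  if (cp < min || cp > 0x10FFFFu || (cp >= 0xD800u && cp <= 0xDFFFu))
    return {replacement, 1};
  return {cp, need + 1};
}
```

Checking the decoded value against a per-length minimum is one simple way to reject overlong forms; a table-driven check on the second byte would be an equally valid design.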
Behdad Esfahbod
7323d385cc Simplify hb_utf_prev<16> to call hb_utf_next<16> 2014-07-11 16:22:13 -04:00
Behdad Esfahbod
c09a607a84 Use hb_in_range() for arabic and indic tables
Though, looks like gcc was smart enough to produce the same code
before...
2014-07-11 16:22:13 -04:00
Behdad Esfahbod
7627100f42 Mark unsigned integer literals with the u suffix
Simplifies hb_in_range() calls as the type can be inferred.
The rest is obsessiveness, I admit.
2014-07-11 16:22:13 -04:00
Behdad Esfahbod
a8b89a09f6 Simplify hb_in_range()
It's both faster and produces smaller code.  Now I feel stupid for
not writing it this way before.
2014-07-11 14:18:01 -04:00
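
The commit message does not show the new body of hb_in_range(). One common way to write a compact range check for unsigned types, given here only as an illustration and not as the code from this commit, is a single comparison after an unsigned subtraction. The name `in_range` is a stand-in for this sketch.

```cpp
#include <type_traits>

// Range check lo <= u <= hi using one comparison.
// Relies on unsigned wrap-around: if u < lo, (u - lo) wraps to a huge value.
// Assumes lo <= hi. This is a generic idiom, not necessarily HarfBuzz's code.
template <typename T>
constexpr bool in_range(T u, T lo, T hi) {
  static_assert(std::is_unsigned<T>::value, "in_range requires an unsigned type");
  // The casts matter for types narrower than int, which would otherwise
  // be promoted to (signed) int before the subtraction.
  return static_cast<T>(u - lo) <= static_cast<T>(hi - lo);
}
```

Because all three parameters share one template type, a call such as `in_range(u, 0x0600u, 0x06FFu)` deduces `T` only when the literals carry the `u` suffix, which fits the remark in 7627100f42 that the suffix lets the type be inferred.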
Behdad Esfahbod
db8934faa1 Simplify hb_utf_prev<8> to call hb_utf_next<8> 2014-07-11 13:58:36 -04:00
Behdad Esfahbod
efe74214bb Show U+FFFD REPLACEMENT CHARACTER for invalid Unicode codepoints
Only if the font doesn't support it.  I.e., this gives the user the ability to
use non-Unicode codepoints as private values and return a meaningful
glyph for them.  But if it's invalid and font callback doesn't
like it, and if font has U+FFFD, show that instead.

Font functions that do not want this automatic replacement to
happen should return true from get_glyph() if unicode > 0x10FFFF.

Replaces https://github.com/behdad/harfbuzz/pull/27
2014-07-11 11:59:48 -04:00
Behdad Esfahbod
6f13b6d62d When parsing UTF-16, generate invalid codepoint for lonely low surrogate
Test passes now.
2014-07-10 19:39:39 -04:00
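
To make the "lonely low surrogate" case concrete, here is a sketch of decoding one code point from UTF-16 in which an unpaired surrogate of either kind yields an invalid marker. `Utf16Result` and `decode_utf16` are hypothetical names, and using U+FFFD as the marker is an assumption borrowed from the later replacement-codepoint commit; at the time of this commit the invalid value was (hb_codepoint_t)-1.

```cpp
#include <cstddef>
#include <cstdint>

struct Utf16Result {
  std::uint32_t codepoint;  // decoded value, or the invalid marker
  std::size_t consumed;     // code units consumed (0 only for empty input)
};

constexpr Utf16Result decode_utf16(const char16_t *s, std::size_t len,
                                   std::uint32_t invalid = 0xFFFDu) {
  if (len == 0) return {invalid, 0};
  std::uint32_t c = s[0];
  if (c < 0xD800u || c > 0xDFFFu) return {c, 1};  // ordinary BMP code unit

  if (c <= 0xDBFFu && len >= 2) {                 // high surrogate: look ahead
    std::uint32_t l = s[1];
    if (l >= 0xDC00u && l <= 0xDFFFu)             // followed by a low surrogate
      return {0x10000u + ((c - 0xD800u) << 10) + (l - 0xDC00u), 2};
  }
  // Lone high surrogate, or a low surrogate with no preceding high one.
  return {invalid, 1};
}
```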
Behdad Esfahbod
24b2ba9dfa [test-buffer] Add test for lonely low-surrogate
Currently fails.  Ouch!
2014-07-10 19:31:16 -04:00
Behdad Esfahbod
6334495ac1 Use zh-Hans / zh-Hant when converting OT language tag to hb_language_t 2014-07-10 19:22:07 -04:00
Behdad Esfahbod
f381e320df Fix lang matching logic
Previous code was broken logically, but harmless.
2014-07-10 19:20:35 -04:00
Behdad Esfahbod
ee5350d667 Accept BCP 47 zh-Hans / zh-Hant language tags 2014-07-10 19:18:56 -04:00
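
A sketch of what accepting these tags could look like: a case-insensitive match on the BCP 47 prefix that yields an OpenType language system tag. The mapping to 'ZHS ' and 'ZHT ' is an assumption of this example, as are all function names; the commit itself does not spell out the table.

```cpp
#include <cstdint>

// Packs four characters into a big-endian 32-bit tag.
constexpr std::uint32_t make_tag(char a, char b, char c, char d) {
  return (static_cast<std::uint32_t>(static_cast<unsigned char>(a)) << 24) |
         (static_cast<std::uint32_t>(static_cast<unsigned char>(b)) << 16) |
         (static_cast<std::uint32_t>(static_cast<unsigned char>(c)) << 8) |
          static_cast<std::uint32_t>(static_cast<unsigned char>(d));
}

constexpr char to_lower(char c) {
  return (c >= 'A' && c <= 'Z') ? static_cast<char>(c - 'A' + 'a') : c;
}

// True if lang equals prefix, or starts with prefix followed by '-'.
constexpr bool matches(const char *lang, const char *prefix) {
  while (*prefix) {
    if (to_lower(*lang) != to_lower(*prefix)) return false;
    ++lang;
    ++prefix;
  }
  return *lang == '\0' || *lang == '-';
}

// Returns an OpenType language tag, or 0 when the tag is not handled here.
// Assumed mapping: zh-Hans -> 'ZHS ', zh-Hant -> 'ZHT '.
constexpr std::uint32_t zh_to_ot_tag(const char *lang) {
  if (matches(lang, "zh-Hans")) return make_tag('Z', 'H', 'S', ' ');
  if (matches(lang, "zh-Hant")) return make_tag('Z', 'H', 'T', ' ');
  return 0;
}
```

Matching on a subtag boundary means "zh-Hant-TW" is accepted while an unrelated tag that merely begins with the same letters is not.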
Behdad Esfahbod
4315402867 [Android.mk] Add note re static library 2014-07-10 17:37:26 -04:00
Behdad Esfahbod
5b4131eb1c [Android.mk] Update for new ICU
https://android-review.googlesource.com/#/c/100722/1/Android.mk
2014-07-09 19:09:08 -04:00
Behdad Esfahbod
ab28196c95 [Android.mk] Re-enable ICU unicode funcs 2014-07-09 18:18:06 -04:00
Behdad Esfahbod
ea001374b8 0.9.30 2014-07-09 17:41:09 -04:00
Behdad Esfahbod
8b16ff1259 [uniscribe] Fix build after recent changes to Offset 2014-07-09 17:41:09 -04:00
Behdad Esfahbod
73f7f8919e Define _POSIX_C_SOURCE only if it is not defined
Fixes https://github.com/behdad/harfbuzz/pull/45
2014-07-09 17:17:18 -04:00
Behdad Esfahbod
6bd5646f1b [tests] Remove bash'ish
Apparently on travis-ci, bash is linked to dash, which doesn't
understand "let".  Failing tests were not being noticed.  See eg:

  https://travis-ci.org/behdad/harfbuzz/jobs/29544211

Don't rely on bash.
2014-07-09 17:07:06 -04:00
Behdad Esfahbod
0afedaa96c [util/hb-shape] Fix crash; oops 2014-07-09 17:00:48 -04:00
Behdad Esfahbod
0cd94491b9 [ucdn] Update to Unicode 7.0.0 data
From http://github.com/behdad/ucdn
2014-07-09 16:53:06 -04:00
Behdad Esfahbod
9d4ede3a97 [Android.mk] Update source list 2014-07-09 16:19:55 -04:00
Behdad Esfahbod
7e1ab1f6d8 [Android.mk] Whitespace 2014-07-09 16:14:01 -04:00
Behdad Esfahbod
5c6695c424 [Android.mk] Remove -lpthread; we build with -DHB_NO_MT 2014-07-09 16:13:58 -04:00
Behdad Esfahbod
9109f1e944 [util/hb-shape] Accept an empty output-format that would skip output
Useful for benchmarking, to avoid buffer serialization overhead (which
seems to by far dominate shaping!)
2014-07-08 20:02:29 -04:00
Behdad Esfahbod
8656408572 [util] Fix hb-view rendering with --font-funcs=ot 2014-07-08 18:10:20 -04:00
Behdad Esfahbod
8650def735 [util] Add option to set font function implementation to use
Supports ft and ot right now.  hb-view currently not rendering with ot.
Will fix after some clean up.
2014-07-05 15:51:25 -04:00
Behdad Esfahbod
2306ad46dc [util] Fix memory issue 2014-07-04 18:09:29 -04:00
Behdad Esfahbod
14a4a9d649 Add Roozbeh to AUTHORS
He's been my shadow for all Indic-related changes in the last
few months.
2014-07-01 15:51:54 -04:00
Behdad Esfahbod
68f724484b [indic] Remove some more now-unused special-cases 2014-06-30 15:46:53 -04:00
Behdad Esfahbod
e79c948980 [indic] Remove special-casing of U+1CF2,1CF3
These were introduced in a498565ced,
but IndicSyllabicCategory has had the correct value already, so the
special code was never needed.
2014-06-30 15:39:39 -04:00
Behdad Esfahbod
d743ce78e1 [indic-table] Update to Unicode 7.0 data
Touch code just enough to preserve previous syllable structure
and functionality as closely as possible.  Many further cleanups
coming later.
2014-06-30 15:24:45 -04:00
Behdad Esfahbod
5fa21b3ab7 [indic-table] Fix category frequency counts in comments 2014-06-30 14:30:54 -04:00
Behdad Esfahbod
5c4e3e9a57 Whitespace 2014-06-30 14:25:18 -04:00
Behdad Esfahbod
af528b6674 Fix typo; ouch! 2014-06-27 18:07:00 -04:00
Behdad Esfahbod
7d4ada66c9 Mark unused members with a "Z" suffix
There may be more.  These are members that are by definition
redundant or reserved and not needed, NOT ones that we merely
don't use *currently*.

I'm sure there's more...
2014-06-27 17:32:56 -04:00