Commit Graph

115 Commits

Author SHA1 Message Date
Behdad Esfahbod
f9edf16725 Add buffer serialization / deserialization API
Two output formats for now: TEXT, and JSON.  For example:

  hb-shape --output-format=json

Deserialization API is added, but not implemented yet.
2012-11-15 13:10:07 -08:00
Behdad Esfahbod
66ac2ff32e API change: Remove "mask" from hb_buffer_add()
I don't expect anybody is using hb_buffer_add(), so this shouldn't break
anyone's code.
2012-11-13 16:26:32 -08:00
Behdad Esfahbod
0c7df22228 Add buffer flags
New API:

	hb_buffer_flags_t

	HB_BUFFER_FLAGS_DEFAULT
	HB_BUFFER_FLAG_BOT
	HB_BUFFER_FLAG_EOT
	HB_BUFFER_FLAG_PRESERVE_DEFAULT_IGNORABLES

	hb_buffer_set_flags()
	hb_buffer_get_flags()

We use the BOT flag to decide whether to insert dottedcircle if the
first char in the buffer is a combining mark.

The PRESERVE_DEFAULT_IGNORABLES flag prevents removal of characters like
ZWNJ/ZWJ/...
2012-11-13 14:42:35 -08:00
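
The two flag behaviours described in this commit can be sketched in a few lines of C. This is an illustration only, not the HarfBuzz source: the `sketch_` names, the numeric flag values, and both helper functions are assumptions made up for the example, which mirrors the flag names locally so it compiles without HarfBuzz headers.

```c
#include <assert.h>

/* Local mirror of the flag names from the commit message. The numeric
 * values are assumptions for this sketch, not copied from hb-buffer.h. */
typedef enum {
    SKETCH_BUFFER_FLAGS_DEFAULT                    = 0x0u,
    SKETCH_BUFFER_FLAG_BOT                         = 0x1u, /* beginning of text */
    SKETCH_BUFFER_FLAG_EOT                         = 0x2u, /* end of text */
    SKETCH_BUFFER_FLAG_PRESERVE_DEFAULT_IGNORABLES = 0x4u
} sketch_buffer_flags_t;

/* Hypothetical helper: a dotted circle is only a candidate when the buffer
 * starts at the beginning of the text and its first char is a combining mark. */
static int
sketch_wants_dotted_circle(unsigned int flags, int first_char_is_mark)
{
    assert(first_char_is_mark == 0 || first_char_is_mark == 1);
    return (flags & SKETCH_BUFFER_FLAG_BOT) && first_char_is_mark;
}

/* Hypothetical helper: default ignorables such as ZWNJ/ZWJ are removed
 * unless the caller asked to preserve them. */
static int
sketch_removes_default_ignorables(unsigned int flags)
{
    return !(flags & SKETCH_BUFFER_FLAG_PRESERVE_DEFAULT_IGNORABLES);
}
```

With the real API, a client would presumably combine the flags with bitwise OR and pass the result to hb_buffer_set_flags() before shaping.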
Behdad Esfahbod
82ecaff736 Add hb_buffer_clear()
Which is like _reset(), but does NOT clear unicode-funcs.
2012-11-13 14:10:00 -08:00
Behdad Esfahbod
da70111ab2 Don't clear buffer pre-context if no new context is being provided
Patch from Jonathan Kew.

Part of fixing:

Mozilla Bug 801410 - avoid inserting dotted-circle for run-initial
Unicode combining characters in "simple" scripts such as Latin

https://bugzilla.mozilla.org/show_bug.cgi?id=801410
2012-10-31 13:45:30 -07:00
Behdad Esfahbod
0bc7a38463 [OT] Fix ReverseChainingSubst
We should make it clear that we don't want output buffer in this case,
otherwise buffer->backtrack_len() would be wrong.
2012-10-29 22:02:45 -07:00
Behdad Esfahbod
38b015e57f Fix hb_buffer_set_length(buffer, 0)
Was causing invalid realloc()s.
2012-10-28 20:11:47 -07:00
Behdad Esfahbod
05207a79e0 [buffer] Save pre/post textual context
To be used for a variety of purposes.  We save up to five characters
in each direction.  No public API changes, everything is taken care
of already.  All clients need to do is to call hb_buffer_add_utf* with
the full text + segment info (or at least some context) instead of
just passing in the segment.

Various operations (hb_buffer_reset, hb_buffer_set_length,
hb_buffer_add*) automatically reset the relevant contexts.
2012-09-25 21:32:21 -04:00
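
As a rough illustration of what "save up to five characters in each direction" means when a client passes the full text plus segment info, here is a minimal sketch. It is not the HarfBuzz implementation: `text_context_t` and `save_context()` are hypothetical names, each byte is treated as one character (so it only models ASCII input), and the context is stored in text order, which may differ from what the library does internally.

```c
#include <assert.h>
#include <string.h>

#define CONTEXT_LENGTH 5 /* "up to five characters in each direction" */

/* Hypothetical container for the saved context around a segment. */
typedef struct {
    char pre[CONTEXT_LENGTH + 1];  /* text just before the segment */
    char post[CONTEXT_LENGTH + 1]; /* text just after the segment */
} text_context_t;

/* Copies at most CONTEXT_LENGTH bytes on each side of the segment
 * [item_offset, item_offset + item_length) within the full text. */
static text_context_t
save_context(const char *text, size_t text_length,
             size_t item_offset, size_t item_length)
{
    text_context_t ctx;
    assert(text != NULL);
    assert(item_offset <= text_length);
    assert(item_length <= text_length - item_offset);

    /* Pre-context: whatever precedes the segment, capped at five bytes. */
    size_t pre_len = item_offset < CONTEXT_LENGTH ? item_offset : CONTEXT_LENGTH;
    memcpy(ctx.pre, text + item_offset - pre_len, pre_len);
    ctx.pre[pre_len] = '\0';

    /* Post-context: whatever follows the segment, capped at five bytes. */
    size_t item_end = item_offset + item_length;
    size_t post_len = text_length - item_end;
    if (post_len > CONTEXT_LENGTH)
        post_len = CONTEXT_LENGTH;
    memcpy(ctx.post, text + item_end, post_len);
    ctx.post[post_len] = '\0';

    return ctx;
}
```

If the client passes only the segment itself (offset 0, length equal to the text length), both contexts come back empty, which is the situation the commit message asks clients to avoid.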
Behdad Esfahbod
1f66c3c1a0 Add hb_utf_strlen()
Speeds up UTF-8 parsing by calling strlen().
2012-09-25 11:42:16 -04:00
Behdad Esfahbod
7f19ae7b9f [buffer] Templatize UTF handling
Also move UTF routines into a separate file, to be reused from shapers
that need it.
2012-09-25 11:23:55 -04:00
Behdad Esfahbod
0e0a4da9b7 [buffer] Towards template'izing different UTF adders 2012-09-25 11:09:04 -04:00
Behdad Esfahbod
7d37280600 Minor 2012-09-25 11:04:41 -04:00
Behdad Esfahbod
96fdc04e5c Add hb_buffer_[sg]et_content_type
And hb_buffer_content_type_t and enum values.
2012-09-06 22:30:53 -04:00
Behdad Esfahbod
b85800f9de [Indic] Implement dotted-circle insertion for broken clusters
No panic, we really insert dotted circle when it's absolutely broken.

Fixes most of the dotted-circle cases against Uniscribe. (for Devanagari
fixes 80% of them, for Khmer 70%; the rest look like Uniscribe being
really bogus...)

I had to make a decision.  Apparently Uniscribe adds one dotted circle
to each broken character.  I tried that, but that goes wrong easily with
split matras.  So I made it add only one dotted circle to an entire
broken syllable tail.  As in: "if there was a dotted circle here, this
would have formed a correct cluster."  That works better for split
stuff, and I like it more.
2012-08-31 19:18:20 -04:00
Behdad Esfahbod
1be368e96f Minor 2012-08-31 16:29:17 -04:00
Behdad Esfahbod
965c280de0 Add HB_BUFFER_ASSERT_VAR
To be used in places we access buffer vars...
2012-08-29 14:02:37 -04:00
Behdad Esfahbod
d5045a5f40 [ICU] Use new normalizer2 compose/decompose API
It's considerably faster than the fallback implementation we had
previously!
2012-08-11 21:27:15 -04:00
Behdad Esfahbod
208f70f055 Inline Unicode callbacks internally 2012-08-01 17:13:10 -04:00
Behdad Esfahbod
69cc492dc1 [buffer] Minor 2012-07-31 14:51:36 -04:00
Behdad Esfahbod
ea278d3895 Partially switch ot shaper to shape_plan 2012-07-27 02:12:28 -04:00
Behdad Esfahbod
47ef931f13 [buffer] Make sure out_info = info during GPOS 2012-07-19 20:52:44 -04:00
Behdad Esfahbod
39b17837b4 Add hb_buffer_normalize_glyphs() and hb-shape --normalize-glyphs
This reorders glyphs within the cluster to a nominal order.  This should
have no visible effect on the output, but helps with testing, for
getting the same hb-shape output for visually-equal glyphs for each
cluster.
2012-07-17 17:09:29 -04:00
Behdad Esfahbod
e085fcf7ca Remove unused buffer->replace_glyphs_be16 2012-06-08 21:45:00 -04:00
Behdad Esfahbod
fe3dabc08d Minor 2012-06-08 20:56:05 -04:00
Behdad Esfahbod
e88e14421a Use merge_clusters instead of open-coding 2012-06-08 20:55:21 -04:00
Behdad Esfahbod
e51d2b6ed1 Extend into main buffer if extension hit end of out-buffer merging clusters 2012-06-08 20:36:33 -04:00
Behdad Esfahbod
5ced012d9f Extend end when merging clusters in out-buffer 2012-06-08 20:31:32 -04:00
Behdad Esfahbod
72c0a18783 Extend clusters backward in out-buffer 2012-06-08 20:30:03 -04:00
Behdad Esfahbod
cd5891493d Extend clusters backwards, into the out-buffer too 2012-06-08 20:28:59 -04:00
Behdad Esfahbod
cafa6f3727 When merging clusters, extend the end 2012-06-08 20:17:10 -04:00
Behdad Esfahbod
2a3d911fe0 Fix alignment-requirement mismatch
Detected by clang and lots of cmdline options.
2012-06-07 17:31:46 -04:00
Behdad Esfahbod
0594a24484 Cleanup TRUE/FALSE vs true/false 2012-06-05 20:35:40 -04:00
Behdad Esfahbod
e1ac38f8dd Fix inert buffer set_length() with zero
Oops!
2012-06-05 20:31:49 -04:00
Behdad Esfahbod
be4560a3b5 Undo default unicode-funcs to avoid static initializer again 2012-06-05 18:43:57 -04:00
Behdad Esfahbod
f06ab8a426 Better hide nil objects and make them const 2012-06-05 14:49:14 -04:00
Behdad Esfahbod
8e3715f8a1 Minor 2012-04-23 22:18:54 -04:00
Behdad Esfahbod
3b26f96ebe Add Thai shaper that does SARA AM decomposition / reordering
That's not in the OpenType spec, but it's what MS and Adobe do.
2012-04-10 10:52:07 -04:00
Behdad Esfahbod
d4cc44716c Move code around, in prep for Thai/Lao shaper 2012-04-07 21:52:28 -04:00
Behdad Esfahbod
c521e793bd Fix OOB in replace_glyph()
Patch from Kenichi Ishibashi.
2012-01-18 21:51:05 -05:00
Behdad Esfahbod
9ebe8c0286 Add buffer->replace_glyphs() 2011-08-26 09:29:42 +02:00
Behdad Esfahbod
e6c09cdf43 Remove the pre_allocate argument from hb_buffer_create()
For two reasons:

1. User can always call hb_buffer_pre_allocate() themselves, and

2. Now we do a pre_alloc in add_utfX anyway, so the total number of
reallocs is limited to a small number (~3) anyway.  This just makes the
API cleaner.
2011-08-19 19:20:26 +02:00
Behdad Esfahbod
4e9ff1dd6e Pre-allocate buffers when adding string
We do a conservative estimate of the number of characters, but still,
this limits the number of buffer reallocs to a small constant.
2011-08-15 16:21:22 +02:00
Behdad Esfahbod
33ccc77902 [API] Make set_user_data() functions take a replace parameter
We need this to set data on objects safely without worrying that some
other thread unsets it by setting it at the same time.
2011-08-09 00:43:24 +02:00
Behdad Esfahbod
944b2ba1ce [buffer] Make API take signed int length
Since we already switched to accepting -1 as 'zero-terminated'.
2011-08-09 00:23:58 +02:00
Behdad Esfahbod
144cd49a0e [buffer] Accept -1 for text_length and item_length
A -1 text_length means: zero-terminated string.
A -1 item_length means: to the end of string.
2011-08-07 00:51:50 -04:00
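
The -1 conventions above can be sketched as a small length-resolution step. This is an assumption-laden illustration, not code from the library: `resolved_lengths_t` and `resolve_lengths()` are hypothetical names, and lengths are counted in bytes of a plain C string.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical result type: the concrete lengths after resolving -1. */
typedef struct {
    unsigned int text_length; /* resolved length of the whole text */
    unsigned int item_length; /* resolved length of the item to add */
} resolved_lengths_t;

/* Hypothetical helper that applies the two rules from the commit message. */
static resolved_lengths_t
resolve_lengths(const char *text, int text_length,
                unsigned int item_offset, int item_length)
{
    resolved_lengths_t r;
    assert(text != NULL);

    /* -1 text_length: the text is zero-terminated, so measure it. */
    r.text_length = text_length < 0 ? (unsigned int) strlen(text)
                                    : (unsigned int) text_length;
    assert(item_offset <= r.text_length);

    /* -1 item_length: the item runs to the end of the text. */
    r.item_length = item_length < 0 ? r.text_length - item_offset
                                    : (unsigned int) item_length;
    assert(r.item_length <= r.text_length - item_offset);

    return r;
}
```

Under these rules, passing -1 for both lengths with an item offset of 0 means "add the whole zero-terminated string".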
Behdad Esfahbod
02aeca985b [API] Changes to main shape API
hb_shape() now accepts a shaper_options and a shaper_list argument.
Both can be set to NULL to emulate previous API.  And in most situations
they are expected to be set to NULL.

hb_shape() also returns a boolean for now.  If shaper_list is NULL, the
return value can be ignored.

shaper_options is ignored for now, but otherwise it should be a
NULL-terminated list of strings.

shaper_list is a NULL-terminated list of strings.  Currently recognized
strings are "ot" for native OpenType Layout implementation, "uniscribe"
for the Uniscribe backend, and "fallback" for the non-complex backend
(that will be implemented shortly).  The fallback backend never fails.

The env var HB_SHAPER_LIST is also parsed and honored.  It's a
colon-separated list of shaper names.  The fallback shaper is invoked if
none of the env-listed shapers succeed.

New API hb_buffer_guess_properties() added.
2011-08-04 22:38:09 -04:00
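
The shaper-selection rule described in this commit (try each name in a colon-separated list, then use the fallback shaper if none succeeds) might look roughly like the sketch below. It is not the HarfBuzz implementation: `pick_shaper()`, `try_shaper_func_t`, and the demo callback `only_ot_succeeds()` are hypothetical, and the list is passed in as a string rather than read from the environment.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical callback: returns nonzero if the named shaper succeeds.
 * The name is NOT zero-terminated; len gives its length. */
typedef int (*try_shaper_func_t)(const char *name, size_t len);

/* Walks a colon-separated list and writes the first shaper that succeeds
 * into out; writes "fallback" if the list is empty or nothing succeeds. */
static void
pick_shaper(const char *list, try_shaper_func_t try_shaper,
            char *out, size_t out_size)
{
    assert(try_shaper != NULL);
    assert(out != NULL && out_size > strlen("fallback"));

    const char *p = list ? list : "";
    while (*p) {
        const char *end = strchr(p, ':');
        size_t len = end ? (size_t) (end - p) : strlen(p);

        /* Skip empty entries and names too long for the output buffer. */
        if (len > 0 && len < out_size && try_shaper(p, len)) {
            memcpy(out, p, len);
            out[len] = '\0';
            return;
        }
        p += len;
        if (*p == ':')
            p++;
    }
    strcpy(out, "fallback"); /* the fallback backend never fails */
}

/* Demo callback for the sketch: pretend only the "ot" shaper succeeds. */
static int
only_ot_succeeds(const char *name, size_t len)
{
    return len == 2 && memcmp(name, "ot", 2) == 0;
}
```

A caller could feed it the value of `getenv("HB_SHAPER_LIST")`; a NULL result (variable unset) goes straight to the fallback in this sketch.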
Behdad Esfahbod
c605bbbb6d Remove C++ guards from source files
Were causing issues for people with MSVC.
2011-08-04 20:00:53 -04:00
Behdad Esfahbod
e62df43649 Add internal hb_buffer_t::get_scratch_buffer() 2011-08-03 17:38:54 -04:00
Behdad Esfahbod
b65c06025d Formalize buffer var allocations 2011-07-28 16:49:29 -04:00
Behdad Esfahbod
a9ad3d3460 Move more code around
Buffer var allocation coming into shape
2011-07-28 15:42:18 -04:00