Usermanual: small updates.
parent
26c5b54fb0
commit
ed13caddf2
@ -15,14 +15,15 @@
|
||||
<section id="creating-and-destroying-buffers">
|
||||
<title>Creating and destroying buffers</title>
|
||||
<para>
|
||||
As we saw in our initial example, a buffer is created and
|
||||
As we saw in our <emphasis>Getting Started</emphasis> example, a
|
||||
buffer is created and
|
||||
initialized with <literal>hb_buffer_create()</literal>. This
|
||||
produces a new, empty buffer object, instantiated with some
|
||||
default values and ready to accept your Unicode strings.
|
||||
</para>
|
||||
<para>
|
||||
HarfBuzz manages the memory of objects that it creates (such as
|
||||
buffers), so you don't have to. When you have finished working on
|
||||
HarfBuzz manages the memory of objects (such as buffers) that it
|
||||
creates, so you don't have to. When you have finished working on
|
||||
a buffer, you can call <literal>hb_buffer_destroy()</literal>:
|
||||
</para>
|
||||
<programlisting language="C">
|
||||
|
@ -6,25 +6,41 @@
|
||||
]>
|
||||
<chapter id="clusters">
|
||||
<title>Clusters</title>
|
||||
<section id="clusters">
|
||||
<title>Clusters</title>
|
||||
<section id="clusters-and-shaping">
|
||||
<title>Clusters and shaping</title>
|
||||
<para>
|
||||
In text shaping, a <emphasis>cluster</emphasis> is a sequence of
|
||||
characters that needs to be treated as a single, indivisible
|
||||
unit.
|
||||
unit. A single letter or symbol can be a cluster of its
|
||||
own. Other clusters correspond to longer subsequences of the
|
||||
input code points — such as a ligature or conjunct form
|
||||
— and require the shaper to ensure that the cluster is not
|
||||
broken during the shaping process.
|
||||
</para>
|
||||
<para>
|
||||
A cluster is distinct from a <emphasis>grapheme</emphasis>,
|
||||
which is the smallest unit of a writing system or script,
|
||||
because clusters are only relevant for script shaping and the
|
||||
layout of glyphs.
|
||||
which is the smallest unit of meaning in a writing system or
|
||||
script.
|
||||
</para>
|
||||
<para>
|
||||
For example, a grapheme may be a letter, a number, a logogram,
|
||||
or a symbol. When two letters form a ligature, however, they
|
||||
combine into a single glyph. They are therefore part of the same
|
||||
cluster and are treated as a unit — even though the two
|
||||
original, underlying letters are separate graphemes.
|
||||
The definitions of the two terms are similar. However, clusters
|
||||
are only relevant for script shaping and glyph layout. In
|
||||
contrast, graphemes are a property of the underlying script, and
|
||||
are of interest when client programs implement orthographic
|
||||
or linguistic functionality.
|
||||
</para>
|
||||
<para>
|
||||
For example, two individual letters are often two separate
|
||||
graphemes. When two letters form a ligature, however, they
|
||||
combine into a single glyph. They are then part of the same
|
||||
cluster and are treated as a unit by the shaping engine —
|
||||
even though the two original, underlying letters remain separate
|
||||
graphemes.
|
||||
</para>
|
||||
<para>
|
||||
HarfBuzz is concerned with clusters, <emphasis>not</emphasis>
|
||||
with graphemes — although client programs using HarfBuzz
|
||||
may still care about graphemes for other reasons from time to time.
|
||||
</para>
|
||||
<para>
|
||||
During the shaping process, there are several shaping operations
|
||||
@ -32,14 +48,15 @@
|
||||
points form a ligature or a conjunct form and are replaced by a
|
||||
single glyph) or split one character into several (for example,
|
||||
when decomposing a code point through the
|
||||
<literal>ccmp</literal> feature).
|
||||
<literal>ccmp</literal> feature). Operations like these alter
|
||||
clusters; HarfBuzz tracks the changes to ensure that no clusters
|
||||
get lost or broken during shaping.
|
||||
</para>
|
||||
<para>
|
||||
HarfBuzz tracks clusters independently from how these
|
||||
shaping operations affect the individual glyphs that comprise the
|
||||
output HarfBuzz returns in a buffer. Consequently,
|
||||
a client program using HarfBuzz can utilize the cluster
|
||||
information to implement features such as:
|
||||
HarfBuzz records cluster information independently from how
|
||||
shaping operations affect the individual glyphs returned in an
|
||||
output buffer. Consequently, a client program using HarfBuzz can
|
||||
use the cluster information to implement features such as:
|
||||
</para>
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
@ -77,11 +94,14 @@
|
||||
<para>
|
||||
Performing line-breaking, justification, and other
|
||||
line-level or paragraph-level operations that must be done
|
||||
after shaping is complete, but which require character-level
|
||||
properties.
|
||||
after shaping is complete, but which require examining
|
||||
character-level properties.
|
||||
</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
</section>
|
||||
<section id="working-with-harfbuzz-clusters">
|
||||
<title>Working with HarfBuzz clusters</title>
|
||||
<para>
|
||||
When you add text to a HarfBuzz buffer, each code point must be
|
||||
assigned a <emphasis>cluster value</emphasis>.
|
||||
@ -94,7 +114,65 @@
|
||||
value does not matter.
|
||||
</para>
|
||||
<para>
|
||||
Client programs can choose how HarfBuzz handles clusters during
|
||||
Some of the shaping operations performed by HarfBuzz —
|
||||
such as reordering, composition, decomposition, and substitution
|
||||
— may alter the cluster values of some characters. The
|
||||
final cluster values in the buffer at the end of the shaping
|
||||
process will indicate to client programs which subsequences of
|
||||
glyphs represent a cluster and, therefore, must not be
|
||||
separated.
|
||||
</para>
|
||||
<para>
|
||||
In addition, client programs can query the final cluster values
|
||||
to discern other potentially important information about the
|
||||
glyphs in the output buffer (such as whether or not a ligature
|
||||
was formed).
|
||||
</para>
|
||||
<para>
|
||||
For example, if the initial sequence of cluster values was:
|
||||
</para>
|
||||
<programlisting>
|
||||
0,1,2,3,4
|
||||
</programlisting>
|
||||
<para>
|
||||
and the final sequence of cluster values is:
|
||||
</para>
|
||||
<programlisting>
|
||||
0,0,3,3
|
||||
</programlisting>
|
||||
<para>
|
||||
then there are two clusters in the output buffer: the first
|
||||
cluster includes the first two glyphs, and the second cluster
|
||||
includes the third and fourth glyphs. It is also evident that a
|
||||
ligature or conjunct has been formed, because there are fewer
|
||||
glyphs in the output buffer (four) than there were code points
|
||||
in the input buffer (five).
|
||||
</para>
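<para>
The reading of the final cluster values described above can be
sketched in a few lines of C. This is an illustrative helper, not
part of the HarfBuzz API: the names
<literal>count_clusters</literal> and
<literal>cluster_length</literal> are hypothetical, and the
sketch assumes the client has already copied the final cluster
values into a plain array.
</para>

```c
/* Illustrative sketch only; these helpers are not HarfBuzz functions. */

/* Count the clusters in a run of final cluster values.
 * Adjacent glyphs that share a value belong to the same cluster. */
unsigned int count_clusters(const unsigned int *clusters, unsigned int n)
{
    if (n == 0)
        return 0;
    unsigned int count = 1;
    for (unsigned int i = 1; i < n; i++) {
        if (clusters[i] != clusters[i - 1])
            count++;
    }
    return count;
}

/* Number of glyphs in the cluster that begins at index `start`.
 * Returns 0 when `start` is past the end of the array. */
unsigned int cluster_length(const unsigned int *clusters, unsigned int n,
                            unsigned int start)
{
    unsigned int end = start;
    while (end < n && clusters[end] == clusters[start])
        end++;
    return end - start;
}
```

<para>
For the final sequence 0,0,3,3, such a helper would report two
clusters of two glyphs each. Comparing the glyph count with the
number of input code points is then a separate, simple check.
</para>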
|
||||
<para>
|
||||
Although client programs using HarfBuzz are free to assign
|
||||
initial cluster values in any manner they choose, HarfBuzz
|
||||
does offer some useful guarantees if the cluster values are
|
||||
assigned in a monotonic (either non-decreasing or non-increasing)
|
||||
order.
|
||||
</para>
|
||||
<para>
|
||||
For left-to-right scripts (LTR) and top-to-bottom scripts (TTB),
|
||||
HarfBuzz will preserve the monotonic property: client programs
|
||||
are guaranteed that monotonically increasing initial cluster
|
||||
values will be returned as monotonically increasing final
|
||||
cluster values.
|
||||
</para>
|
||||
<para>
|
||||
For right-to-left scripts (RTL) and bottom-to-top scripts (BTT),
|
||||
the directionality of the buffer itself is reversed for final
|
||||
output as a matter of design. Therefore, HarfBuzz inverts the
|
||||
monotonic property: client programs are guaranteed that
|
||||
monotonically increasing initial cluster values will be
|
||||
returned as monotonically <emphasis>decreasing</emphasis> final
|
||||
cluster values.
|
||||
</para>
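<para>
A client program that relies on these guarantees might want to
verify them while debugging. The following is a minimal sketch
under that assumption: the function names are hypothetical, and
the cluster values are assumed to have been copied into a plain
array in output order.
</para>

```c
/* Illustrative sketch only; these helpers are not HarfBuzz functions. */

/* Returns 1 if the values never decrease from left to right
 * (the order expected for LTR and TTB output), otherwise 0. */
int is_non_decreasing(const unsigned int *clusters, unsigned int n)
{
    for (unsigned int i = 1; i < n; i++) {
        if (clusters[i] < clusters[i - 1])
            return 0;
    }
    return 1;
}

/* Returns 1 if the values never increase from left to right
 * (the order expected for RTL and BTT output), otherwise 0. */
int is_non_increasing(const unsigned int *clusters, unsigned int n)
{
    for (unsigned int i = 1; i < n; i++) {
        if (clusters[i] > clusters[i - 1])
            return 0;
    }
    return 1;
}
```

<para>
Which of the two checks applies depends on the direction of the
buffer. An empty or single-glyph run passes both.
</para>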
|
||||
<para>
|
||||
Client programs can adjust how HarfBuzz handles clusters during
|
||||
shaping by setting the
|
||||
<literal>cluster_level</literal> of the
|
||||
buffer. HarfBuzz offers three <emphasis>levels</emphasis> of
|
||||
@ -179,7 +257,7 @@
|
||||
assign initial cluster values in a buffer by reusing the indices
|
||||
of the code points in the input text. This gives a sequence of
|
||||
cluster values that is monotonically increasing (for example,
|
||||
0,1,2,3,4,5).
|
||||
0,1,2,3,4).
|
||||
</para>
|
||||
<para>
|
||||
It is not <emphasis>required</emphasis> that the cluster values
|
||||
@ -233,16 +311,44 @@
|
||||
</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
|
||||
</section>
|
||||
|
||||
<section id="a-clustering-example-for-levels-0-and-1">
|
||||
<title>A clustering example for levels 0 and 1</title>
|
||||
<para>
|
||||
The guarantees and benefits of level 0 and level 1 can be seen
|
||||
with some examples. First, let us examine what happens with cluster
|
||||
values when shaping involves cluster merging with ligatures and
|
||||
decomposition.
|
||||
The basic shaping operations affect clusters in a predictable
|
||||
manner when using level 0 or level 1:
|
||||
</para>
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>
|
||||
When two or more clusters <emphasis>merge</emphasis>, the
|
||||
resulting merged cluster takes as its cluster value the
|
||||
<emphasis>minimum</emphasis> of the incoming cluster values.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
When a cluster <emphasis>decomposes</emphasis>, all of the
|
||||
resulting child clusters inherit as their cluster value the
|
||||
cluster value of the parent cluster.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
When a character is <emphasis>reordered</emphasis>, the
|
||||
reordered character and all clusters that the character
|
||||
moves past as part of the reordering are merged into one cluster.
|
||||
</para>
|
||||
</listitem>
|
||||
</itemizedlist>
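<para>
The merge rule in the list above can be modeled with a short C
sketch. It is not HarfBuzz's internal implementation: the name
<literal>merge_clusters_sketch</literal> is hypothetical, and the
sketch assumes that glyphs belonging to one cluster are adjacent
in the array. The range is first widened so that no existing
cluster is split, and then every glyph in it receives the minimum
cluster value.
</para>

```c
/* Illustrative sketch only; this is not a HarfBuzz function. */

/* Merge the glyphs in [start, end) into a single cluster.
 * `n` is the total number of glyphs in `clusters`. */
void merge_clusters_sketch(unsigned int *clusters, unsigned int n,
                           unsigned int start, unsigned int end)
{
    if (start >= end || end > n)
        return;

    /* Widen the range so that it covers whole clusters only. */
    while (start > 0 && clusters[start - 1] == clusters[start])
        start--;
    while (end < n && clusters[end] == clusters[end - 1])
        end++;

    /* Find the minimum cluster value in the widened range. */
    unsigned int min = clusters[start];
    for (unsigned int i = start + 1; i < end; i++) {
        if (clusters[i] < min)
            min = clusters[i];
    }

    /* Every glyph in the range takes the minimum value. */
    for (unsigned int i = start; i < end; i++)
        clusters[i] = min;
}
```

<para>
Decomposition needs no such helper in this model: each new glyph
is simply given a copy of its parent's cluster value.
</para>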
|
||||
<para>
|
||||
The functionality, guarantees, and benefits of level 0 and level
|
||||
1 behavior can be seen with some examples. First, let us examine
|
||||
what happens with cluster values when shaping involves cluster
|
||||
merging with ligatures and decomposition.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Let's say we start with the following character sequence (top row) and
|
||||
initial cluster values (bottom row):
|
||||
@ -279,8 +385,8 @@
|
||||
<para>
|
||||
Next, let us say that the <literal>BC</literal> ligature glyph
|
||||
decomposes into three components, and <literal>D</literal> also
|
||||
decomposes into two components. These components each inherit the
|
||||
cluster value of their parent:
|
||||
decomposes into two components. Whenever a cluster decomposes,
|
||||
its components each inherit the cluster value of their parent:
|
||||
</para>
|
||||
<programlisting>
|
||||
A,BC0,BC1,BC2,D0,D1,E
|
||||
@ -295,6 +401,12 @@
|
||||
A,BC0,BC1,BC2D0,D1,E
|
||||
0,1 ,1 ,1 ,1 ,4
|
||||
</programlisting>
|
||||
<para>
|
||||
Note that the entirety of cluster 3 merges into cluster 1, not
|
||||
just the <literal>D0</literal> glyph. This reflects the fact
|
||||
that the cluster <emphasis>must</emphasis> be treated as an
|
||||
indivisible unit.
|
||||
</para>
|
||||
<para>
|
||||
At this point, cluster 1 means: the character sequence
|
||||
<literal>BCD</literal> is represented by glyphs
|
||||
@ -319,18 +431,24 @@
|
||||
0,1,2,3,4
|
||||
</programlisting>
|
||||
<para>
|
||||
If <literal>D</literal> is reordered to before <literal>B</literal>,
|
||||
then HarfBuzz merges the <literal>B</literal>,
|
||||
<literal>C</literal>, and <literal>D</literal> clusters, and we
|
||||
get:
|
||||
If <literal>D</literal> is reordered to the position immediately
|
||||
before <literal>B</literal>, then HarfBuzz merges the
|
||||
<literal>B</literal>, <literal>C</literal>, and
|
||||
<literal>D</literal> clusters — all the clusters between
|
||||
the final position of the reordered glyph and its original
|
||||
position. This means that we get:
|
||||
</para>
|
||||
<programlisting>
|
||||
A,D,B,C,E
|
||||
0,1,1,1,4
|
||||
</programlisting>
|
||||
<para>
|
||||
This is clearly not ideal, but it is the only sensible way to
|
||||
maintain a monotonic sequence of cluster values and retain the
|
||||
as the final cluster sequence.
|
||||
</para>
|
||||
<para>
|
||||
Merging this many clusters is not ideal, but it is the only
|
||||
sensible way for HarfBuzz to maintain the guarantee that the
|
||||
sequence of cluster values remains monotonic and to retain the
|
||||
true relationship between glyphs and characters.
|
||||
</para>
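<para>
The reordering case can be modeled in the same spirit. The sketch
below is an assumption-laden illustration rather than HarfBuzz
code: <literal>reorder_before</literal> is a hypothetical name,
glyphs are represented as single characters, and each glyph in
the affected range is assumed to start out as its own cluster.
</para>

```c
/* Illustrative sketch only; this is not a HarfBuzz function. */

/* Move the glyph at index `from` to the earlier index `to`, then merge
 * every cluster between the two positions into one cluster. */
void reorder_before(char *glyphs, unsigned int *clusters,
                    unsigned int from, unsigned int to)
{
    if (to >= from)
        return;

    char moved = glyphs[from];
    unsigned int min = clusters[from];

    /* Shift the glyphs in [to, from) one slot to the right and
     * track the smallest cluster value seen along the way. */
    for (unsigned int i = from; i > to; i--) {
        glyphs[i] = glyphs[i - 1];
        if (clusters[i - 1] < min)
            min = clusters[i - 1];
    }
    glyphs[to] = moved;

    /* The moved glyph and everything it passed share one cluster. */
    for (unsigned int i = to; i <= from; i++)
        clusters[i] = min;
}
```

<para>
Applied to the glyphs A,B,C,D,E with cluster values 0,1,2,3,4,
moving <literal>D</literal> from index 3 to index 1 yields
A,D,B,C,E with cluster values 0,1,1,1,4.
</para>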
|
||||
</section>
|
||||
@ -340,8 +458,9 @@
|
||||
The preceding examples demonstrate the main effects of using
|
||||
cluster levels 0 and 1. The only difference between the two
|
||||
levels is this: in level 0, at the very beginning of the shaping
|
||||
process, HarfBuzz also merges clusters between any base character
|
||||
and all Unicode marks (combining or not) that follow it.
|
||||
process, HarfBuzz merges the cluster of each base character
|
||||
with the clusters of all Unicode marks (combining or not) and
|
||||
modifiers that follow it.
|
||||
</para>
|
||||
<para>
|
||||
For example, let us start with the following character sequence
|
||||
@ -361,6 +480,10 @@
|
||||
A,acute,B
|
||||
0,0 ,2
|
||||
</programlisting>
|
||||
<para>
|
||||
This merger is performed before any other script-shaping
|
||||
steps.
|
||||
</para>
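<para>
As a rough model of this level 0 step, consider the C sketch
below. It is not how HarfBuzz decides what counts as a mark: the
<literal>is_mark</literal> array is a hypothetical input that a
real client would derive from Unicode character properties, and
the initial cluster values are assumed to be increasing.
</para>

```c
/* Illustrative sketch only; this is not a HarfBuzz function. */

/* Give each mark the cluster value of the character before it.
 * Because values are copied left to right, a run of several marks
 * all ends up with the cluster value of the base character. */
void merge_marks_into_bases(unsigned int *clusters, const int *is_mark,
                            unsigned int n)
{
    for (unsigned int i = 1; i < n; i++) {
        if (is_mark[i])
            clusters[i] = clusters[i - 1];
    }
}
```

<para>
For A,acute,B with initial cluster values 0,1,2 and only the
acute flagged as a mark, the result is 0,0,2. At level 1 this
step is skipped, so the values would stay 0,1,2.
</para>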
|
||||
<para>
|
||||
This initial cluster merging is the default behavior of the
|
||||
Windows shaping engine, and the old HarfBuzz codebase copied
|
||||
@ -368,9 +491,10 @@
|
||||
remained the default behavior in the new HarfBuzz codebase.
|
||||
</para>
|
||||
<para>
|
||||
But this initial cluster-merging behavior makes it impossible to
|
||||
But this initial cluster-merging behavior makes it impossible for
|
||||
client programs to implement some features (such as coloring
|
||||
diacritic marks differently from their base
|
||||
characters. That is why, in level 1, HarfBuzz does not perform
|
||||
characters). That is why, in level 1, HarfBuzz does not perform
|
||||
the initial merging step.
|
||||
</para>
|
||||
<para>
|
||||
@ -378,29 +502,34 @@
|
||||
perform cursor positioning, level 0 is more convenient. But
|
||||
relying on cluster boundaries for cursor positioning is wrong: cursor
|
||||
positions should be determined based on Unicode grapheme
|
||||
boundaries, not on shaping-cluster boundaries. As such, level 1
|
||||
clusters are preferred.
|
||||
boundaries, not on shaping-cluster boundaries. As such, using
|
||||
level 1 clustering behavior is recommended.
|
||||
</para>
|
||||
<para>
|
||||
One last note about levels 0 and 1. HarfBuzz currently does not allow a
|
||||
<literal>MultipleSubst</literal> lookup to replace a glyph with zero
|
||||
glyphs (in other words, to delete a glyph). But, in some other situations,
|
||||
glyphs can be deleted. In those cases, if the glyph being deleted is
|
||||
the last glyph of its cluster, HarfBuzz makes sure to merge the cluster
|
||||
with a neighboring cluster.
|
||||
One final facet of levels 0 and 1 is worth noting. HarfBuzz
|
||||
currently does not allow any
|
||||
<emphasis>multiple-substitution</emphasis> GSUB lookups to
|
||||
replace a glyph with zero glyphs (in other words, to delete a
|
||||
glyph).
|
||||
</para>
|
||||
<para>
|
||||
But, in some other situations, glyphs can be deleted. In
|
||||
those cases, if the glyph being deleted is the last glyph of its
|
||||
cluster, HarfBuzz makes sure to merge the deleted glyph's
|
||||
cluster with a neighboring cluster.
|
||||
</para>
|
||||
<para>
|
||||
This is done primarily to make sure that the starting cluster of the
|
||||
text always has the cluster index pointing to the start of the text
|
||||
for the run; more than one client currently relies on this
|
||||
for the run; more than one client program currently relies on this
|
||||
guarantee.
|
||||
</para>
|
||||
<para>
|
||||
Incidentally, Apple's CoreText does something else to maintain the
|
||||
same promise: it inserts a glyph with id 65535 at the beginning of
|
||||
the glyph string if the glyph corresponding to the first character
|
||||
in the run was deleted. HarfBuzz might do something similar in the
|
||||
future.
|
||||
Incidentally, Apple's CoreText does something different to
|
||||
maintain the same promise: it inserts a glyph with id 65535 at
|
||||
the beginning of the glyph string if the glyph corresponding to
|
||||
the first character in the run was deleted. HarfBuzz might do
|
||||
something similar in the future.
|
||||
</para>
|
||||
</section>
|
||||
<section id="level-2">
|
||||
@ -415,16 +544,39 @@
|
||||
performs no merging of clusters whatsoever.
|
||||
</para>
|
||||
<para>
|
||||
When glyphs form a ligature (or when some other feature
|
||||
substitutes multiple glyphs with one glyph), the cluster value
|
||||
of the first glyph is retained as the cluster value for the
|
||||
ligature. However, no subsequent clusters — including
|
||||
marks and modifiers — are affected.
|
||||
This means that there is no initial base-and-mark merging step
|
||||
(as is done in level 0), and it means that reordering moves and
|
||||
ligature substitutions do not trigger a cluster merge.
|
||||
</para>
|
||||
<para>
|
||||
Level 2 cluster behavior is less complex than level 0 or level
|
||||
1, but there are a few cases in which processing cluster values
|
||||
produced at level 2 may be tricky.
|
||||
Only one shaping operation directly affects clusters when using
|
||||
level 2:
|
||||
</para>
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>
|
||||
When a cluster <emphasis>decomposes</emphasis>, all of the
|
||||
resulting child clusters inherit as their cluster value the
|
||||
cluster value of the parent cluster.
|
||||
</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
<para>
|
||||
When glyphs do form a ligature (or when some other feature
|
||||
substitutes multiple glyphs with one glyph), the cluster value
|
||||
of the first glyph is retained as the cluster value for the
|
||||
resulting ligature.
|
||||
</para>
|
||||
<para>
|
||||
This occurrence sounds similar to a cluster merge, but it is
|
||||
different. In particular, no subsequent characters —
|
||||
including marks and modifiers — are affected. They retain
|
||||
their previous cluster values.
|
||||
</para>
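<para>
The difference from a cluster merge can be seen in a small C
model of a level 2 ligature substitution. The function name
<literal>ligate_level2</literal> is hypothetical and the code is
a sketch of the behavior described above, not HarfBuzz's
implementation: the ligature keeps the cluster value of its first
component, and every glyph after it keeps its own value.
</para>

```c
/* Illustrative sketch only; this is not a HarfBuzz function. */

/* Replace the glyphs in [start, end) with one ligature glyph.
 * Returns the new number of glyphs; `clusters` is updated in place. */
unsigned int ligate_level2(unsigned int *clusters, unsigned int n,
                           unsigned int start, unsigned int end)
{
    if (start >= end || end > n)
        return n;

    /* The ligature occupies index `start` and keeps clusters[start]. */
    unsigned int removed = end - start - 1;

    /* Close the gap; the following glyphs keep their cluster values. */
    for (unsigned int i = end; i < n; i++)
        clusters[i - removed] = clusters[i];

    return n - removed;
}
```

<para>
Starting from cluster values 0,1,2,3,4 and forming a ligature
from the glyphs at indices 1 and 2 gives 0,1,3,4. At level 0 or
level 1, a mark that followed the ligature could instead have
been pulled into the ligature's cluster by a merge.
</para>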
|
||||
<para>
|
||||
Level 2 cluster behavior is ultimately less complex than level 0
|
||||
or level 1, but there are several cases for which processing
|
||||
cluster values produced at level 2 may be tricky.
|
||||
</para>
|
||||
<section id="ligatures-with-combining-marks-in-level-2">
|
||||
<title>Ligatures with combining marks in level 2</title>
|
||||
@ -532,10 +684,11 @@
|
||||
<para>
|
||||
There may be other problems encountered with ligatures under
|
||||
level 2, such as when the direction of the text is forced to the
|
||||
opposite of its natural direction (for example, left-to-right
|
||||
Arabic). But, generally speaking, these other scenarios are
|
||||
minor corner cases that are too obscure for most client
|
||||
programs to need to worry about.
|
||||
opposite of its natural direction (for example, Arabic text
|
||||
that is forced into left-to-right directionality). But,
|
||||
generally speaking, these other scenarios are minor corner
|
||||
cases that are too obscure for most client programs to need to
|
||||
worry about.
|
||||
</para>
|
||||
</section>
|
||||
</section>
|
||||
|
@ -76,12 +76,41 @@
|
||||
<section>
|
||||
<title>Terminology</title>
|
||||
<variablelist>
|
||||
<?dbfo list-presentation="blocks"?>
|
||||
<varlistentry>
|
||||
<term>script</term>
|
||||
<listitem>
|
||||
<para>
|
||||
In text shaping, a <emphasis>script</emphasis> is a
|
||||
writing system: a set of symbols, rules, and conventions
|
||||
that is used to represent a language or multiple
|
||||
languages.
|
||||
</para>
|
||||
<para>
|
||||
In general computing lingo, the word "script" can also
|
||||
be used to mean an executable program (usually one
|
||||
written in a human-readable programming language). For
|
||||
the sake of clarity, HarfBuzz documents will always use
|
||||
more specific terminology when referring to this
|
||||
meaning, such as "Python script" or "shell script." In
|
||||
all other instances, "script" refers to a writing system.
|
||||
</para>
|
||||
<para>
|
||||
For developers using HarfBuzz, it is important to note
|
||||
the distinction between a script and a language. Most
|
||||
scripts are used to write a variety of different
|
||||
languages, and many languages may be written in more
|
||||
than one script.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term>shaper</term>
|
||||
<listitem>
|
||||
<para>
|
||||
In HarfBuzz, a <emphasis>shaper</emphasis> is a
|
||||
handler for a specific script shaping model. HarfBuzz
|
||||
handler for a specific script-shaping model. HarfBuzz
|
||||
implements separate shapers for Indic, Arabic, Thai and
|
||||
Lao, Khmer, Myanmar, Tibetan, Hangul, Hebrew, the
|
||||
Universal Shaping Engine (USE), and a default shaper for
|
||||
@ -95,12 +124,12 @@
|
||||
<listitem>
|
||||
<para>
|
||||
In text shaping, a <emphasis>cluster</emphasis> is a
|
||||
sequence of codepoints that must be handled as an
|
||||
indivisible unit. Clusters can include codepoint
|
||||
sequence of code points that must be treated as an
|
||||
indivisible unit. Clusters can include code-point
|
||||
sequences that form a ligature or base-and-mark
|
||||
sequences. Tracking and preserving clusters is important
|
||||
when shaping operations might separate or reorder
|
||||
codepoints.
|
||||
code points.
|
||||
</para>
|
||||
<para>
|
||||
HarfBuzz provides three cluster
|
||||
@ -111,7 +140,59 @@
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
|
||||
<varlistentry>
|
||||
<term>grapheme</term>
|
||||
<listitem>
|
||||
<para>
|
||||
In linguistics, a <emphasis>grapheme</emphasis> is one
|
||||
of the indivisible units that make up a writing system or
|
||||
script. Often, graphemes are individual symbols (letters,
|
||||
numbers, punctuation marks, logograms, etc.) but,
|
||||
depending on the writing system, a particular grapheme
|
||||
might correspond to a sequence of several Unicode code
|
||||
points.
|
||||
</para>
|
||||
<para>
|
||||
In practice, HarfBuzz and other text-shaping engines
|
||||
are not generally concerned with graphemes. However, it
|
||||
is important for developers using HarfBuzz to recognize
|
||||
that there is a difference between graphemes and shaping
|
||||
clusters (see above). The two concepts may overlap
|
||||
frequently, but there is no guarantee that they will be
|
||||
identical.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term>syllable</term>
|
||||
<listitem>
|
||||
<para>
|
||||
In linguistics, a <emphasis>syllable</emphasis> is
|
||||
a sequence of sounds that makes up a building block of a
|
||||
particular language. Every language has its own set of
|
||||
rules describing what constitutes a valid syllable.
|
||||
</para>
|
||||
<para>
|
||||
For text-shaping purposes, the various definitions of
|
||||
"syllable" are important because script-specific shaping
|
||||
operations may be applied at the syllable level. For
|
||||
example, a reordering rule might specify that a vowel
|
||||
mark be reordered to the beginning of the syllable.
|
||||
</para>
|
||||
<para>
|
||||
Syllables consist of one or more Unicode code
|
||||
points. The definition of a syllable for a particular
|
||||
writing system might correspond to how HarfBuzz
|
||||
identifies clusters (see above) for the same writing
|
||||
system. However, it is important for developers using
|
||||
HarfBuzz to recognize that there is a difference between
|
||||
syllables and shaping clusters. The two concepts may
|
||||
overlap frequently, but there is no guarantee that they
|
||||
will be identical.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
</variablelist>
|
||||
|
||||
</section>
|
||||
|
@ -126,7 +126,7 @@
|
||||
</para>
|
||||
<para>
|
||||
If you need to build HarfBuzz from source, first put the
|
||||
<program>ragel</program> binary on your
|
||||
<package>ragel</package> binary on your
|
||||
<literal>PATH</literal>, then follow the appveyor CI cmake
|
||||
<ulink
|
||||
url="https://github.com/harfbuzz/harfbuzz/blob/master/appveyor.yml">build
|
||||
@ -229,6 +229,7 @@
|
||||
</para>
|
||||
|
||||
<variablelist>
|
||||
<?dbfo list-presentation="blocks"?>
|
||||
<varlistentry>
|
||||
<term>--with-libstdc++</term>
|
||||
<listitem>
|
||||
|
@ -182,22 +182,23 @@
|
||||
Southeast Asian scripts are also assigned
|
||||
<emphasis>Unicode Indic Syllabic Category</emphasis> (UISC) and
|
||||
<emphasis>Unicode Indic Positional Category</emphasis> (UIPC)
|
||||
property that provides more detailed information needed for
|
||||
properties that provide more detailed information needed for
|
||||
shaping.
|
||||
</para>
|
||||
<para>
|
||||
The UISC property sub-categorizes Letters and Marks according to
|
||||
common script-shaping behaviors. For example, UISC distinguishes
|
||||
between consonant letters, vowel letters, and vowel marks. The
|
||||
UIPC property sub-categorizes Mark codepoints by the visual
|
||||
UIPC property sub-categorizes Mark codepoints by the relative visual
|
||||
position that they occupy (above, below, right, left, or in
|
||||
multiple positions).
|
||||
</para>
|
||||
<para>
|
||||
Some complex scripts require that the text run be split into
|
||||
syllables, and what constitutes a valid syllable in these
|
||||
scripts is specified in regular expressions of the Letter and
|
||||
Mark codepoints that take the UISC and UIPC properties into account.
|
||||
syllables. What constitutes a valid syllable in these
|
||||
scripts is specified in regular expressions, formed from the
|
||||
Letter and Mark codepoints, that take the UISC and UIPC
|
||||
properties into account.
|
||||
</para>
|
||||
|
||||
</section>
|
||||
|