From 088755f9e654d2ec638dce0c68d523084b9eaf5a Mon Sep 17 00:00:00 2001
From: Nathan Willis
Date: Wed, 10 Oct 2018 16:37:29 -0500
Subject: [PATCH] Docs: update usermanual What Is HarfBuzz material.

---
 docs/usermanual-what-is-harfbuzz.xml | 222 +++++++++++++++++++++------
 1 file changed, 173 insertions(+), 49 deletions(-)

diff --git a/docs/usermanual-what-is-harfbuzz.xml b/docs/usermanual-what-is-harfbuzz.xml
index 3a98b53f2..6a3ac1029 100644
--- a/docs/usermanual-what-is-harfbuzz.xml
+++ b/docs/usermanual-what-is-harfbuzz.xml
@@ -11,8 +11,8 @@
 HarfBuzz can properly shape all of the world's major writing
- systems. It runs on virtually all operating systems and software
- platforms, and it supports all of the standard font formats in use
+ systems. It runs on all major operating systems and software
+ platforms, and it supports all of the modern font formats in use
 today.
@@ -41,9 +41,7 @@
 The dominant format is OpenType. The OpenType specification
 defines a series of shaping models for
- various scripts (including Indic, Arabic, Hangul, Hebrew, Khmer,
- Myanmar, Thai and Lao, Tibetan, and a Universal Shaping Engine
- designed to cover other scripts). These shaping models depend on
+ various scripts from around the world. These shaping models depend on
 the font including certain features in its GSUB and GPOS tables.
@@ -55,12 +53,13 @@
 TrueType fonts can also include OpenType shaping features.
 Alternatively, TrueType fonts can also include Apple Advanced
 Typography (AAT) tables to implement shaping
- support. AAT fonts are generally only found on macOS systems.
+ support. AAT fonts are generally only found on macOS and iOS systems.
 Text strings will usually be tagged with a script and language
- tag that provide the context for text shaping. Script
+ tag that provide the context needed to perform text shaping
+ correctly. The necessary Script
 and language tags are defined by OpenType.
@@ -72,24 +71,25 @@
 Text shaping is an integral part of preparing text for display.
 Before a Unicode sequence can be rendered, the
- codepoints in the sequence must be mapped to the glyphs
- provided in the font, and the glyphs must be positioned
+ codepoints in the sequence must be mapped to the corresponding
+ glyphs provided in the font, and those glyphs must be positioned
 correctly relative to each other. For many of the scripts
 supported in Unicode, these steps involve script-specific layout
- rules.
+ rules, including complex joining, reordering, and positioning
+ behavior.
 Implementing these rules is the job of the shaping engine.
 Text shaping is a fairly low-level operation. HarfBuzz is
- used directly by graphic rendering libraries such as Pango, as
- well as by the layout engines in Firefox, LibreOffice, and
- Chromium. Unless you are writing one of
- these layout engines yourself, you will probably not need to use
- HarfBuzz: normally, lower-level libraries will turn text into
- glyphs for you.
+ used directly by graphical rendering libraries like Pango, as well as by the layout
+ engines in Firefox, LibreOffice, and Chromium. Unless you are
+ writing one of these layout engines
+ yourself, you will probably not need to use HarfBuzz: normally,
+ lower-level libraries will turn text into glyphs for you.
 However, if you are writing a layout engine
- or graphics library yourself, you will need to perform text
+ or graphics library yourself, then you will need to perform text
 shaping, and this is where HarfBuzz can help you.
@@ -104,14 +104,15 @@
 all other symbols), which are indexed by a glyph ID.
- The glyph ID within the font does not necessarily correlate
- to a predictable Unicode codepoint. For instance, some fonts
- have the letter "a" as glyph ID 1, but many others do
- not. To pull the right glyph out of the font in order to
- display "a", you need to consult the table inside
- the font (the cmap table) that maps Unicode
- codepoints to glyph IDs. In other words, text shaping turns
- codepoints into glyph IDs.
+ A particular glyph ID within the font does not necessarily
+ correlate to a predictable Unicode codepoint. For instance,
+ some fonts have the letter "a" as glyph ID 1, but
+ many others do not. In order to retrieve the right glyph
+ from the font to display "a", you need to consult
+ the table inside the font (the cmap
+ table) that maps Unicode codepoints to glyph IDs. In other
+ words, text shaping turns codepoints into glyph
+ IDs.
@@ -125,7 +126,7 @@
 Whether you should render an "f, i" sequence as fi or as "fi" does not
- depend on the input text. Rather, it depends on the whether
+ depend on the input text. Instead, it depends on whether
 or not the font includes an "fi" glyph and on the level of
 ligature application you wish to perform.
 The font and the amount of ligature application used are under your
@@ -195,26 +196,148 @@
 right position, you need to consult the table inside the font
 (the GPOS table) that contains positioning information.
- In other words, text shaping tells you whether you have a
- precomposed glyph within your font or if you need to compose a
- glyph yourself out of combining marks—and, if so, where to
- position those marks.
+ In other words, text shaping tells you whether you
+ have a precomposed glyph within your font or if you need to
+ compose a glyph yourself out of combining marks—and,
+ if so, where to position those marks.
- If tasks like these are something that you need to do, then you need a text
- shaping engine. You could use Uniscribe if you are writing
- Windows software; you could use CoreText on macOS; or you could
- use HarfBuzz.
-
-
- In the rest of this manual, we are going to assume that you are the
- implementor of a text-layout engine.
+ If tasks like these are something that you need to do, then you
+ need a text shaping engine. You could use Uniscribe if you are
+ writing Windows software; you could use CoreText on macOS; or
+ you could use HarfBuzz.
+
+
+ In the rest of this manual, the text will assume that the reader
+ is the implementor of a text-layout engine.
+
+
-
+ +
+ What does HarfBuzz do?
+
+ HarfBuzz provides OpenType text shaping through a cross-platform
+ C API that accepts sequences of Unicode input text. Currently,
+ the following OpenType shaping models are supported:
+
+
+
+
+ Indic (covering Devanagari, Bengali, Gujarati,
+ Gurmukhi, Kannada, Malayalam, Oriya, Tamil, Telugu, and
+ Sinhala)
+
+
+
+
+ Arabic (covering Arabic, N'Ko, Syriac, and Mongolian)
+
+
+
+
+ Thai and Lao
+
+
+
+
+ Khmer
+
+
+
+
+ Myanmar
+
+
+
+
+
+ Tibetan
+
+
+
+
+
+ Hangul
+
+
+
+
+
+ Hebrew
+
+
+
+
+ The Universal Shaping Engine or USE
+ (covering complex scripts not covered by the above shaping
+ models)
+
+
+
+
+ A default shaping model for non-complex scripts
+ (covering Latin, Cyrillic, Greek, Armenian, Georgian, Tifinagh,
+ and many others)
+
+
+
+
+ Emoji (including emoji modifier sequences, flag sequences,
+ and ZWJ sequences)
+
+
+
+
+
+ In addition to OpenType shaping, HarfBuzz supports the latest
+ version of Graphite shaping. HarfBuzz currently supports AAT
+ shaping only on macOS and iOS systems, and in a pass-through
+ fashion: HarfBuzz hands off AAT support to the system CoreText
+ library. However, full, built-in AAT support within HarfBuzz is
+ under development.
+
+
+
+ HarfBuzz can read and understand TrueType fonts (.ttf), TrueType
+ collections (.ttc), and OpenType fonts (.otf, including those
+ fonts that contain TrueType-style outlines and those that
+ contain PostScript CFF or CFF2 outlines).
+
+
+
+ HarfBuzz can run on top of the FreeType, CoreText, DirectWrite,
+ or Uniscribe font renderers.
+
+
+
+ In addition to its core shaping functionality, HarfBuzz provides
+ functions for accessing other font features, including optional
+ GSUB and GPOS OpenType features, as well as
+ all color-font formats (CBDT,
+ sbix, COLR/CPAL, and
+ SVG-OT) and OpenType variable fonts. HarfBuzz
+ also includes a font-subsetting feature.
+
+
+ HarfBuzz can perform some low-level math-shaping operations,
+ although it does not currently perform full shaping for
+ mathematical typesetting.
+
+
+
+ A suite of command-line utilities is also provided in the
+ source-code tree, designed to help users test and debug
+ HarfBuzz's features on real-world fonts and input.
+
+
+ +
 What HarfBuzz doesn't do
 HarfBuzz will take a Unicode string, shape it, and give you the
@@ -223,7 +346,7 @@
 extent of HarfBuzz's responsibility.
- It is important to note that if you are implementing a
+ It is important to note that if you are implementing a
 complete text-layout engine you may have other responsibilities
 that HarfBuzz will not help you with. For example:
@@ -239,13 +362,13 @@
 sequence:
-A B C [space] ג ב א [space] D E F
+ A B C [space] ג ב א [space] D E F
 but will expect to see in the output:
-ABC אבג DEF
+ ABC אבג DEF
 This reordering is called bidi processing
@@ -253,8 +376,9 @@ ABC אבג DEF
 algorithm as an annex to the Unicode Standard which tells you how
 to reorder a string from logical order into presentation order.
 Before sending your string to HarfBuzz, you may need to apply the
- bidi algorithm to it. Libraries such as ICU and fribidi can do
- this for you.
+ bidi algorithm to it. Libraries such as ICU and fribidi can do this for you.
@@ -304,7 +428,7 @@ ABC אבג DEF
 returns is up to you.
-
+
 Why is it called HarfBuzz?
@@ -312,9 +436,9 @@ ABC אבג DEF
 project (and you will see references to the FreeType authors
 within the source code copyright declarations), but was then
 extracted out to its own project. This project is maintained by
- Behdad Esfahbod, and named HarfBuzz. Originally, it was a shaping
- engine for OpenType fonts—"HarfBuzz" is the Persian
- for "open type".
+ Behdad Esfahbod, who named it HarfBuzz. Originally, it was a
+ shaping engine for OpenType fonts—"HarfBuzz" is
+ the Persian for "open type".