[HarfBuzz] harfbuzz: Branch 'master' - 3 commits
Behdad Esfahbod
behdad at kemper.freedesktop.org
Mon Aug 31 01:54:59 PDT 2015
docs/usermanual-ch01.xml | 115 +++++++++++++++++++++++++++++
docs/usermanual-ch02.xml | 182 +++++++++++++++++++++++++++++++++++++++++++++++
docs/usermanual-ch03.xml | 77 +++++++++++++++++++
docs/usermanual-ch04.xml | 18 ++++
docs/usermanual-ch05.xml | 13 +++
docs/usermanual-ch06.xml | 8 ++
6 files changed, 413 insertions(+)
New commits:
commit c424b41705b50055c7f92b268cf78a2680af73af
Merge: 31594b9 5470e74
Author: Behdad Esfahbod <behdad at behdad.org>
Date: Mon Aug 31 09:53:16 2015 +0100
Merge pull request #129 from simoncozens/docs
First two chapters. More to follow.
commit 5470e744dd264c2dc33437a68d20bcf7c5ffb905
Author: Simon Cozens <simon at simon-cozens.org>
Date: Sat Aug 29 08:21:18 2015 +0100
Current state and skeleton outline
diff --git a/docs/usermanual-ch03.xml b/docs/usermanual-ch03.xml
new file mode 100644
index 0000000..66ec0a8
--- /dev/null
+++ b/docs/usermanual-ch03.xml
@@ -0,0 +1,77 @@
+<sect1 id="buffers-language-script-and-direction">
+ <title>Buffers, language, script and direction</title>
+ <para>
+ The input to Harfbuzz is a series of Unicode characters, stored in a
+ buffer. In this chapter, we'll look at how to set up a buffer with
+ the text that we want and then customize the properties of the
+ buffer.
+ </para>
+ <sect2 id="creating-and-destroying-buffers">
+ <title>Creating and destroying buffers</title>
+ <para>
+ As we saw in our initial example, a buffer is created and
+ initialized with <literal>hb_buffer_create()</literal>. This
+ produces a new, empty buffer object, instantiated with some
+ default values and ready to accept your Unicode strings.
+ </para>
+ <para>
+ Harfbuzz manages the memory of objects that it creates (such as
+ buffers), so you don't have to. When you have finished working on
+ a buffer, you can call <literal>hb_buffer_destroy()</literal>:
+ </para>
+ <programlisting language="C">
+ hb_buffer_t *buffer = hb_buffer_create();
+ ...
+ hb_buffer_destroy(buffer);
+</programlisting>
+ <para>
+ This will destroy the object and free its associated memory -
+ unless some other part of the program holds a reference to this
+ buffer. If you acquire a Harfbuzz buffer from another subsystem
+ and want to ensure that it is not freed when someone
+ else destroys it, you should increase its reference count:
+ </para>
+ <programlisting language="C">
+void somefunc(hb_buffer_t *buffer) {
+ buffer = hb_buffer_reference(buffer);
+ ...
+</programlisting>
+ <para>
+ And then decrease it once you're done with it:
+ </para>
+ <programlisting language="C">
+ hb_buffer_destroy(buffer);
+}
+</programlisting>
+ <para>
+ To throw away all the data in your buffer and start from scratch,
+ call <literal>hb_buffer_reset(buffer)</literal>. If you want to
+ throw away the string in the buffer but keep the options, you can
+ instead call <literal>hb_buffer_clear_contents(buffer)</literal>.
+ </para>
+ </sect2>
+ <sect2 id="adding-text-to-the-buffer">
+ <title>Adding text to the buffer</title>
+ <para>
+ Now we have a brand new Harfbuzz buffer. Let's start filling it
+ with text! From Harfbuzz's perspective, a buffer is just a stream
+ of Unicode codepoints, but your input string is probably in one of
+ the standard Unicode character encodings (UTF-8, UTF-16 or UTF-32).
+ </para>
+ </sect2>
+ <sect2 id="setting-buffer-properties">
+ <title>Setting buffer properties</title>
+ <para>
+ </para>
+ </sect2>
+ <sect2 id="what-about-the-other-scripts">
+ <title>What about the other scripts?</title>
+ <para>
+ </para>
+ </sect2>
+ <sect2 id="customizing-unicode-functions">
+ <title>Customizing Unicode functions</title>
+ <para>
+ </para>
+ </sect2>
+</sect1>
\ No newline at end of file
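The reference-counting behaviour described in the "Creating and destroying buffers" section above can be modelled with a toy object. This is only a sketch of the general idea, not HarfBuzz's implementation: toy_buffer_t, toy_live_buffers and the toy_buffer_* functions are hypothetical names invented for illustration. The assumption is that "create" starts the count at 1, "reference" increments it, and "destroy" decrements it and frees the object only when the count reaches zero.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical toy object for illustration; NOT HarfBuzz's hb_buffer_t. */
typedef struct {
    int ref_count; /* number of owners currently holding the object */
} toy_buffer_t;

/* Counts the toy buffers that are still allocated (illustration only). */
static int toy_live_buffers = 0;

/* Create a new object owned by the caller: the count starts at 1. */
toy_buffer_t *toy_buffer_create(void) {
    toy_buffer_t *b = malloc(sizeof *b);
    if (b == NULL)
        return NULL;
    b->ref_count = 1;
    toy_live_buffers++;
    return b;
}

/* Register an additional owner and hand the same pointer back. */
toy_buffer_t *toy_buffer_reference(toy_buffer_t *b) {
    assert(b != NULL && b->ref_count > 0);
    b->ref_count++;
    return b;
}

/* Drop one owner; the memory is released only when no owner is left. */
void toy_buffer_destroy(toy_buffer_t *b) {
    if (b == NULL)
        return;
    assert(b->ref_count > 0);
    if (--b->ref_count == 0) {
        toy_live_buffers--;
        free(b);
    }
}
```

In this model, a function that receives a buffer from elsewhere calls toy_buffer_reference() on entry and toy_buffer_destroy() on exit, so the object survives even if the original owner destroys it in the meantime.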
diff --git a/docs/usermanual-ch04.xml b/docs/usermanual-ch04.xml
new file mode 100644
index 0000000..c469147
--- /dev/null
+++ b/docs/usermanual-ch04.xml
@@ -0,0 +1,18 @@
+<sect1 id="fonts-and-faces">
+ <title>Fonts and faces</title>
+ <sect2 id="using-freetype">
+ <title>Using FreeType</title>
+ <para>
+ </para>
+ </sect2>
+ <sect2 id="using-harfbuzzs-native-opentype-implementation">
+ <title>Using Harfbuzz's native OpenType implementation</title>
+ <para>
+ </para>
+ </sect2>
+ <sect2 id="using-your-own-font-functions">
+ <title>Using your own font functions</title>
+ <para>
+ </para>
+ </sect2>
+</sect1>
\ No newline at end of file
diff --git a/docs/usermanual-ch05.xml b/docs/usermanual-ch05.xml
new file mode 100644
index 0000000..6f50174
--- /dev/null
+++ b/docs/usermanual-ch05.xml
@@ -0,0 +1,13 @@
+<sect1 id="shaping-and-shape-plans">
+ <title>Shaping and shape plans</title>
+ <sect2 id="opentype-features">
+ <title>OpenType features</title>
+ <para>
+ </para>
+ </sect2>
+ <sect2 id="plans-and-caching">
+ <title>Plans and caching</title>
+ <para>
+ </para>
+ </sect2>
+</sect1>
\ No newline at end of file
diff --git a/docs/usermanual-ch06.xml b/docs/usermanual-ch06.xml
new file mode 100644
index 0000000..ca674c0
--- /dev/null
+++ b/docs/usermanual-ch06.xml
@@ -0,0 +1,8 @@
+<sect1 id="glyph-information">
+ <title>Glyph information</title>
+ <sect2 id="names-and-numbers">
+ <title>Names and numbers</title>
+ <para>
+ </para>
+ </sect2>
+</sect1>
\ No newline at end of file
commit f0807654da160bd7ceb9aff5b8338ec0b643171c
Author: Simon Cozens <simon at simon-cozens.org>
Date: Tue Aug 25 19:57:15 2015 +0100
First two chapters. More to follow.
diff --git a/docs/usermanual-ch01.xml b/docs/usermanual-ch01.xml
new file mode 100644
index 0000000..1ee0cbe
--- /dev/null
+++ b/docs/usermanual-ch01.xml
@@ -0,0 +1,115 @@
+<sect1 id="what-is-harfbuzz">
+ <title>What is Harfbuzz?</title>
+ <para>
+ Harfbuzz is a <emphasis>text shaping engine</emphasis>. It solves
+ the problem of selecting and positioning glyphs from a font given a
+ Unicode string.
+ </para>
+ <sect2 id="why-do-i-need-it">
+ <title>Why do I need it?</title>
+ <para>
+ Text shaping is an integral part of preparing text for display. It
+ is a fairly low level operation; Harfbuzz is used directly by
+ graphic rendering libraries such as Pango, and the layout engines
+ in Firefox, LibreOffice and Chromium. Unless you are
+ <emphasis>writing</emphasis> one of these layout engines yourself,
+ you will probably not need to use Harfbuzz - normally higher level
+ libraries will turn text into glyphs for you.
+ </para>
+ <para>
+ However, if you <emphasis>are</emphasis> writing a layout engine
+ or graphics library yourself, you will need to perform text
+ shaping, and this is where Harfbuzz can help you. Here are some
+ reasons why you need it:
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ OpenType fonts contain a set of glyphs, indexed by glyph ID.
+ The glyph ID within the font does not necessarily relate to a
+ Unicode codepoint. For instance, some fonts have the letter
+ "a" as glyph ID 1. To pull the right glyph out of
+ the font in order to display it, you need to consult a table
+ within the font (the "cmap" table) which maps
+ Unicode codepoints to glyph IDs. Text shaping turns codepoints
+ into glyph IDs.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Many OpenType fonts contain ligatures: combinations of
+ characters which are rendered together. For instance, it's
+ common for the <literal>fi</literal> combination to appear in
+ print as the single ligature "fi". Whether you should
+ render text as <literal>fi</literal> or "fi" does not
+ depend on the input text, but on the capabilities of the font
+ and the level of ligature application you wish to perform.
+ Text shaping involves querying the font's ligature tables and
+ determining what substitutions should be made.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ While ligatures like "fi" are typographic
+ refinements, some languages <emphasis>require</emphasis> such
+ substitutions to be made in order to display text correctly.
+ In Tamil, when the letter "TTA" (ட) is
+ followed by the vowel sign "U" (ு), the combination should appear
+ as the single glyph "டு". The sequence of Unicode
+ characters "ட" and "ு" needs to be rendered as a single
+ glyph from the font - text shaping chooses the correct glyph
+ from the sequence of characters provided.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Similarly, an Arabic letter can take up to four different forms:
+ within a font, there will be glyphs for the initial, medial,
+ final, and isolated forms of a letter. Unicode only encodes
+ one codepoint per character, and so a Unicode string will not
+ tell you which glyph to use. Text shaping chooses the correct
+ form of the letter and returns the correct glyph from the font
+ that you need to render.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Other languages have marks and accents which need to be
+ rendered in certain positions around a base character. For
+ instance, the Moldovan language has the Cyrillic letter
+ "zhe" (ж) with a breve accent, like so: ӂ. Some
+ fonts will contain this character as an individual glyph,
+ whereas other fonts will not contain a zhe-with-breve glyph
+ but expect the rendering engine to form the character by
+ overlaying the two glyphs ж and ˘. Where you should draw the
+ combining breve depends on the height of the preceding glyph.
+ Again, for Arabic, the correct positioning of vowel marks
+ depends on the height of the character on which you are
+ placing the mark. Text shaping tells you whether you have a
+ precomposed glyph within your font or if you need to compose a
+ glyph yourself out of combining marks, and if so, where to
+ position those marks.
+ </para>
+ </listitem>
+ </itemizedlist>
+ <para>
+ If this is something that you need to do, then you need a text
+ shaping engine: you could use Uniscribe if you are using Windows;
+ you could use CoreText on OS X; or you could use Harfbuzz. In the
+ rest of this manual, we are going to assume that you are the
+ implementor of a text layout engine.
+ </para>
+ </sect2>
+ <sect2 id="why-is-it-called-harfbuzz">
+ <title>Why is it called Harfbuzz?</title>
+ <para>
+ Harfbuzz began its life as text shaping code within the FreeType
+ project (and you will see references to the FreeType authors
+ within the source code copyright declarations) but was then
+ abstracted out to its own project. This project is maintained by
+ Behdad Esfahbod, and named Harfbuzz. Originally, it was a shaping
+ engine for OpenType fonts - "Harfbuzz" is the Persian
+ for "open type".
+ </para>
+ </sect2>
+</sect1>
\ No newline at end of file
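The first two list items in usermanual-ch01.xml above (codepoint-to-glyph mapping and ligature substitution) can be sketched with a toy font. Everything here is an assumption made for illustration: the table contents, the glyph IDs, and the names toy_cmap, toy_glyph_for_codepoint and toy_shape are hypothetical and do not come from HarfBuzz or from any real font. The only convention borrowed from OpenType is that glyph ID 0 stands for a missing glyph.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical glyph IDs of a toy font (glyph 0 means "missing glyph"). */
enum { GLYPH_MISSING = 0, GLYPH_A = 1, GLYPH_F = 2, GLYPH_I = 3, GLYPH_FI = 4 };

typedef struct {
    uint32_t codepoint; /* Unicode codepoint */
    uint16_t glyph_id;  /* glyph index inside the toy font */
} toy_cmap_entry_t;

/* Toy stand-in for a "cmap" table: maps codepoints to glyph IDs. */
static const toy_cmap_entry_t toy_cmap[] = {
    { 0x0061, GLYPH_A }, /* 'a' is glyph 1, as in the example in the text */
    { 0x0066, GLYPH_F }, /* 'f' */
    { 0x0069, GLYPH_I }, /* 'i' */
};

/* Look a codepoint up in the toy cmap; unknown codepoints map to glyph 0. */
uint16_t toy_glyph_for_codepoint(uint32_t cp) {
    for (size_t i = 0; i < sizeof toy_cmap / sizeof toy_cmap[0]; i++) {
        if (toy_cmap[i].codepoint == cp)
            return toy_cmap[i].glyph_id;
    }
    return GLYPH_MISSING;
}

/* Map each codepoint to a glyph, replacing the pair f,i with the "fi"
 * ligature glyph. Returns the number of glyphs written to out. */
size_t toy_shape(const uint32_t *text, size_t len, uint16_t *out, size_t out_cap) {
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        uint16_t g = toy_glyph_for_codepoint(text[i]);
        if (n > 0 && out[n - 1] == GLYPH_F && g == GLYPH_I) {
            out[n - 1] = GLYPH_FI; /* ligature substitution: two glyphs become one */
            continue;
        }
        assert(n < out_cap);
        out[n++] = g;
    }
    return n;
}
```

Note that the output can be shorter than the input: the three codepoints f, i, a produce only two glyphs. A real shaping engine reads these rules from the font's own tables rather than hard-coding them.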
diff --git a/docs/usermanual-ch02.xml b/docs/usermanual-ch02.xml
new file mode 100644
index 0000000..f0a161d
--- /dev/null
+++ b/docs/usermanual-ch02.xml
@@ -0,0 +1,182 @@
+<sect1 id="hello-harfbuzz">
+ <title>Hello, Harfbuzz</title>
+ <para>
+ Here's the simplest Harfbuzz program that can possibly work. We will improve
+ it later.
+ </para>
+ <orderedlist numeration="arabic">
+ <listitem>
+ <para>
+ Create a buffer and put your text in it.
+ </para>
+ </listitem>
+ </orderedlist>
+ <programlisting language="C">
+ #include &lt;hb.h&gt;
+ hb_buffer_t *buf;
+ buf = hb_buffer_create();
+ hb_buffer_add_utf8(buf, text, strlen(text), 0, strlen(text));
+</programlisting>
+ <orderedlist numeration="arabic">
+ <listitem override="2">
+ <para>
+ Guess the script, language and direction of the buffer.
+ </para>
+ </listitem>
+ </orderedlist>
+ <programlisting language="C">
+ hb_buffer_guess_segment_properties(buf);
+</programlisting>
+ <orderedlist numeration="arabic">
+ <listitem override="3">
+ <para>
+ Create a face and a font, using FreeType for now.
+ </para>
+ </listitem>
+ </orderedlist>
+ <programlisting language="C">
+ #include &lt;hb-ft.h&gt;
+ FT_New_Face(ft_library, font_path, index, &amp;face);
+ hb_font_t *font = hb_ft_font_create(face, NULL);
+</programlisting>
+ <orderedlist numeration="arabic">
+ <listitem override="4">
+ <para>
+ Shape!
+ </para>
+ </listitem>
+ </orderedlist>
+ <programlisting language="C">
+ hb_shape(font, buf, NULL, 0);
+</programlisting>
+ <orderedlist numeration="arabic">
+ <listitem override="5">
+ <para>
+ Get the glyph and position information.
+ </para>
+ </listitem>
+ </orderedlist>
+ <programlisting language="C">
+ hb_glyph_info_t *glyph_info = hb_buffer_get_glyph_infos(buf, &amp;glyph_count);
+ hb_glyph_position_t *glyph_pos = hb_buffer_get_glyph_positions(buf, &amp;glyph_count);
+</programlisting>
+ <orderedlist numeration="arabic">
+ <listitem override="6">
+ <para>
+ Iterate over each glyph.
+ </para>
+ </listitem>
+ </orderedlist>
+ <programlisting language="C">
+ for (i = 0; i &lt; glyph_count; ++i) {
+ glyphid = glyph_info[i].codepoint;
+ x_offset = glyph_pos[i].x_offset / 64.0;
+ y_offset = glyph_pos[i].y_offset / 64.0;
+ x_advance = glyph_pos[i].x_advance / 64.0;
+ y_advance = glyph_pos[i].y_advance / 64.0;
+ draw_glyph(glyphid, cursor_x + x_offset, cursor_y + y_offset);
+ cursor_x += x_advance;
+ cursor_y += y_advance;
+ }
+</programlisting>
+ <orderedlist numeration="arabic">
+ <listitem override="7">
+ <para>
+ Tidy up.
+ </para>
+ </listitem>
+ </orderedlist>
+ <programlisting language="C">
+ hb_buffer_destroy(buf);
+ hb_font_destroy(font);
+</programlisting>
+ <sect2 id="what-harfbuzz-doesnt-do">
+ <title>What Harfbuzz doesn't do</title>
+ <para>
+ The code above will take a UTF-8 string, shape it, and give you the
+ information required to lay it out correctly on a single
+ horizontal (or vertical) line using the font provided. That is the
+ extent of Harfbuzz's responsibility.
+ </para>
+ <para>
+ If you are implementing a text layout engine, you may have other
+ responsibilities that Harfbuzz will not help you with:
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ Harfbuzz won't help you with bidirectionality. If you want to
+ lay out text with mixed Hebrew and English, you will need to
+ ensure that the buffer provided to Harfbuzz has those
+ characters in the correct layout order. This will be different
+ from the logical order in which the Unicode text is stored. In
+ other words, the user will hit the keys in the following
+ sequence:
+ </para>
+ <programlisting>
+A B C [space] ג ב א [space] D E F
+ </programlisting>
+ <para>
+ but will expect to see in the output:
+ </para>
+ <programlisting>
+ABC אבג DEF
+ </programlisting>
+ <para>
+ This reordering is called <emphasis>bidi processing</emphasis>
+ ("bidi" is short for bidirectional), and there's an
+ algorithm as an annex to the Unicode Standard which tells you how
+ to reorder a string from logical order into presentation order.
+ Before sending your string to Harfbuzz, you may need to apply the
+ bidi algorithm to it. Libraries such as ICU and fribidi can do
+ this for you.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Harfbuzz won't help you with text that contains different font
+ properties. For instance, if you have the string "a
+ <emphasis>huge</emphasis> breakfast", and you expect
+ "huge" to be italic, you will need to send three
+ strings to Harfbuzz: <literal>a</literal>, in your Roman font;
+ <literal>huge</literal> using your italic font; and
+ <literal>breakfast</literal> using your Roman font again.
+ Similarly if you change font, font size, script, language or
+ direction within your string, you will need to shape each run
+ independently and then output them independently. Harfbuzz
+ expects to shape a run of characters sharing the same
+ properties.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Harfbuzz won't help you with line breaking, hyphenation or
+ justification. As mentioned above, it lays out the string
+ along a <emphasis>single line</emphasis> of, notionally,
+ infinite length. If you want to find out where the potential
+ word, sentence and line break points are in your text, you
+ could use the ICU library's break iterator functions.
+ </para>
+ <para>
+ Harfbuzz can tell you how wide a shaped piece of text is, which is
+ useful input to a justification algorithm, but it knows nothing
+ about paragraphs, lines or line lengths. Nor will it adjust the
+ space between words to fit them proportionally into a line. If you
+ want to lay out text in paragraphs, you will probably want to send
+ each word of your text to Harfbuzz to determine its shaped width
+ after glyph substitutions, then work out how many words will fit
+ on a line, and then finally output each word of the line separated
+ by a space of the correct size to fully justify the paragraph.
+ </para>
+ </listitem>
+ </itemizedlist>
+ <para>
+ As a layout engine implementor, you will find that Harfbuzz helps with the
+ interface between your text and your font, and that's something
+ that you'll need - what you then do with the glyphs that your font
+ returns is up to you. The example we saw above is enough to get us
+ started using Harfbuzz. Now we are going to use the remainder of
+ Harfbuzz's API to refine that example and improve our text shaping
+ capabilities.
+ </para>
+ </sect2>
+</sect1>
\ No newline at end of file
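The glyph loop in step 6 of usermanual-ch02.xml above divides every position value by 64.0. A plausible reading, stated here as an assumption, is that a font created through FreeType reports positions in 26.6 fixed-point units (1/64 of a pixel). The sketch below reproduces only the cursor arithmetic of that loop on plain data so that it can be checked without a font. toy_position_t, toy_point_t and toy_layout_run are hypothetical names; the struct merely mirrors the four field names used in the loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the four position fields used in the loop above,
 * assumed to hold 26.6 fixed-point values (1/64 of a pixel). */
typedef struct {
    int32_t x_advance, y_advance;
    int32_t x_offset, y_offset;
} toy_position_t;

typedef struct {
    double x, y;
} toy_point_t;

/* Walk a shaped run: store the drawing origin of every glyph in origins[]
 * and return the cursor position after the last glyph. */
toy_point_t toy_layout_run(const toy_position_t *pos, size_t count,
                           toy_point_t start, toy_point_t *origins) {
    assert(count == 0 || (pos != NULL && origins != NULL));
    toy_point_t cursor = start;
    for (size_t i = 0; i < count; i++) {
        /* Offsets shift this glyph only; they do not move the cursor. */
        origins[i].x = cursor.x + pos[i].x_offset / 64.0;
        origins[i].y = cursor.y + pos[i].y_offset / 64.0;
        /* Advances move the cursor on to the next glyph. */
        cursor.x += pos[i].x_advance / 64.0;
        cursor.y += pos[i].y_advance / 64.0;
    }
    return cursor;
}
```

The difference between the returned cursor and the start point is the total advance of the run, which is the "shaped width" that the justification discussion above refers to.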