<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Fri, Jan 3, 2014 at 5:07 PM, Jonathan Kew <span dir="ltr">&lt;<a href="mailto:jfkthame@googlemail.com" target="_blank">jfkthame@googlemail.com</a>&gt;</span> wrote:<br> <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im"> <br> In order to do fallback,<br> you need to do character to glyph mapping.<br> </div></blockquote> <br> Not necessarily. You need to know the character repertoire supported by the font, but you may not need to actually map to glyphs. In Firefox, for instance, font fallback is done based on a per-font *bit* map of supported Unicode codepoints. So at the font fallback stage, we know whether the character is present, but do not map it to a glyph.</blockquote> <div><br></div><div>Isn't this introducing an additional step that's not strictly necessary? Presumably the bitmap is constructed from the font's cmap (unless you use the OS/2 ulUnicodeRange* fields, which wouldn't be 100% accurate).</div> <div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im"> <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> If the application is<br> expected to do fallback before calling harfbuzz, why does harfbuzz<br> expect chars rather than glyphs as input to the shaping process?<br> </blockquote> <br></div> One good reason, at least, would be that shaping requires harfbuzz to have access to Unicode character properties, and it is not necessarily possible to derive these from glyph IDs.</blockquote><div><br></div><div>Good point.
</div> <div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im"> <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> I<br> would have expected there to be some application callback that harfbuzz<br> would call when there is a need to do fallback; the application would<br> use this to tell harfbuzz how to handle this situation for this<br> particular character. Some of these harfbuzz could handle by itself,<br> others would require cooperation between the application and harfbuzz.<br> </blockquote> <br></div> This doesn't fit into the harfbuzz model, where the "unit of work" is a buffer that is shaped with one specified font according to the rules of one script and language system.<br></blockquote><div><br></div> <div>I agree that the "unit of work" has to be one font (in a particular size) in one script/language system/direction. I don't see that as a limitation of harfbuzz, but rather of OpenType. However, I'm not sure that this is necessarily incompatible with the sort of callback scheme I was thinking of.</div> <div><br></div><div>In doing fallback for a particular font/code point, we can distinguish two categories of fallback:</div><div><br></div><div>a) something that replaces the code point by a sequence of one or more code points (possibly with positioning adjustments) in the same font/size as was requested</div> <div><br></div><div>b) anything else</div><div><br></div><div>In the case of (a), the callback can tell harfbuzz what the replacement sequence is, and harfbuzz can then incorporate it into the buffer.</div><div><br></div> <div>In the case of (b), the callback can tell harfbuzz that it needs to be handled as a separate "unit of work". 
This would cause harfbuzz to shape only up to (but not including) the problematic code point and return a value to the application indicating where it stopped.</div> <div> <br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> Using compatibility decompositions to provide a fallback rendering would be one of the options such a higher-level component might use. Pushing this down into HB itself seems quite problematic to me, given the huge variety of types of compatibility decompositions, some of which require some kind of additional styling to avoid corrupting the intended meaning of the data.</blockquote> <div><br></div><div>I agree. One important case is the various kinds of spaces (U+2000 to U+200B). Just using the compatibility decomposition is suboptimal. If I have an em space, I don't want it replaced by just a space, but by a space that is 1 em wide.<br> </div><div><br></div><div>James</div></div><br></div></div>