Hi,

I would like to confirm my plan for fixing a bug in Indic language rendering, since it is also related to 'cluster'. In Indic languages, the GSUB lookups need to be performed on the character cluster. In the present code, the GSUB lookups are applied to the entire glyph sequence, which causes bugs in Telugu (confirmed) and Kannada (most likely). I have changed the code to restrict the lookups to clusters (patch available at <a href="https://bugzilla.gnome.org/show_bug.cgi?id=579398">https://bugzilla.gnome.org/show_bug.cgi?id=579398</a>). As this caused problems with Firefox rendering, I have kept the fix pending. I would like to know whether there is a better solution.

The requirement is to limit the lookup buffer length to the character cluster, as determined during parsing. For example, in Telugu, given ka + matra 'a' + sha + halanth + space (original typing order), the first two belong to one cluster and the next two to another cluster. GSUB should be checked for the first two as a unit and for the second two as a unit. Presently GSUB is applied to the entire glyph sequence, and each character becomes an independent cluster. It should be applied only when, based on the parsing, ka+sha+halanth or ka+sha+halanth+halanth is determined to be a cluster (after the reordering rules are applied), with no other characters in between.

Is it possible to achieve this at the layout level without language-specific code, using the current data structures (e.g. gproperties)?

Note: Halanth is used as a joiner between two consonants to form conjunct consonants in Indic languages.

Regards
Arjun

<div class="gmail_quote">2010/6/4 Behdad Esfahbod <<a href="mailto:behdad@behdad.org">behdad@behdad.org</a>> <blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">Hi Jonathan,

All of those are planned as per discussion in Reading. It may take a week or more before I get to implementing them though, since it involves quite some shuffling.
What does your timeline for these look like?

behdad

<div><div></div><div class="h5">
On 06/03/2010 09:42 AM, Jonathan Kew wrote:
> Hi Behdad,
>
> As we discussed a bit in Reading, I'd like the handling of the 'cluster' field to be modified so that combining marks retain their original 'cluster' values, unless of course they get ligated with the base or otherwise processed. This will better preserve the association between glyphs and the original text. (We need this in order to identify glyphs such as CGJ in the final buffer.)
>
> To do this, I think it's necessary to change hb_form_clusters into something like hb_mark_clusters, and have it set a flag in gproperties for the mark glyphs instead of actually changing the cluster field; then hb_buffer_reverse_clusters can use this instead of relying on the cluster value.
>
> I have not actually created a patch for this yet, as I'm not sure how you want to handle the bits in gproperties. I notice that it looks like only the low 16 bits are currently used; one option might be to split the field into two 16-bit fields, one for "glyph properties" (from GDEF), and one for "character" or "slot" properties, where the combining mark flag based on Unicode category could go.
>
> (I'd also suggest that "cluster" should be renamed "src_index", but that's a secondary issue.)
>
> JK
>
>
_______________________________________________
HarfBuzz mailing list
<a href="mailto:HarfBuzz@lists.freedesktop.org">HarfBuzz@lists.freedesktop.org</a>
<a href="http://lists.freedesktop.org/mailman/listinfo/harfbuzz" target="_blank">http://lists.freedesktop.org/mailman/listinfo/harfbuzz</a>
</div></div></blockquote></div>
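
The per-cluster restriction asked about at the top of this message can be sketched roughly as follows. This is only a toy model in Python, not HarfBuzz code: the glyph names, the `RULES` table, and the helpers `apply_rules` and `substitute_per_cluster` are all hypothetical, and the rule that spans two clusters is made up purely to show the difference between whole-buffer and per-cluster substitution.

```python
from itertools import groupby

# Hypothetical toy substitution table: glyph-name sequence -> replacement.
# Real GSUB lookups are far richer; these rules are purely illustrative.
RULES = {
    ("sha", "halant"): ["sha_half"],
    ("matra_a", "sha"): ["cross_boundary_form"],  # made-up rule spanning two clusters
}


def apply_rules(glyphs):
    """Greedy left-to-right substitution over one run of glyph names."""
    out, i = [], 0
    while i < len(glyphs):
        for length in (3, 2):  # try the longest match first
            key = tuple(glyphs[i:i + length])
            if len(key) == length and key in RULES:
                out.extend(RULES[key])
                i += length
                break
        else:
            # No rule matched at this position: copy the glyph through.
            out.append(glyphs[i])
            i += 1
    return out


def substitute_per_cluster(buffer):
    """buffer is a list of (glyph_name, cluster_id) pairs.

    Substitution is applied to each run of equal cluster ids separately,
    so no rule can match across a cluster boundary.
    """
    out = []
    for cluster, items in groupby(buffer, key=lambda g: g[1]):
        names = [name for name, _ in items]
        out.extend((name, cluster) for name in apply_rules(names))
    return out


# ka + matra 'a' form cluster 0, sha + halant form cluster 1, space is cluster 2.
buffer = [("ka", 0), ("matra_a", 0), ("sha", 1), ("halant", 1), ("space", 2)]

whole = apply_rules([name for name, _ in buffer])  # lookups over the entire sequence
per_cluster = substitute_per_cluster(buffer)       # lookups limited to each cluster
```

With the whole sequence as one lookup buffer, the made-up cross-boundary rule fires and consumes `sha` before the `sha` + `halant` rule can apply; with per-cluster buffers it cannot match, and the second cluster is substituted as a unit.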