[HarfBuzz] A problem in thai shaper

Behdad Esfahbod behdad at behdad.org
Tue Apr 17 10:41:55 PDT 2012


Your HarfBuzz build probably doesn't have glib, and you are not providing any
Unicode functions, so cluster formation fails.  I shall make HB warn boldly if
that happens.
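
To illustrate why the missing Unicode functions matter, here is a minimal,
self-contained sketch. It is not HarfBuzz's actual code: the callback type,
the function names and the small Thai mark table are all made up for the
example. The idea is only that a mark joins the cluster of the character
before it, and that a fallback which reports "not a mark" for everything
leaves the input clusters untouched.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: nonzero if cp is a nonspacing mark. */
typedef int (*is_mark_func_t)(uint32_t cp);

/* Stand-in for a build with no Unicode functions: knows nothing. */
static int is_mark_nil(uint32_t cp)
{
    (void)cp;
    return 0;
}

/* Stand-in for real Unicode data, limited to Thai nonspacing marks
 * (U+0E31, U+0E34..U+0E3A, U+0E47..U+0E4E). */
static int is_mark_thai(uint32_t cp)
{
    return cp == 0x0E31 ||
           (cp >= 0x0E34 && cp <= 0x0E3A) ||
           (cp >= 0x0E47 && cp <= 0x0E4E);
}

/* Simplified cluster formation: every mark takes the cluster value of
 * the preceding character; everything else keeps its own value. */
static void form_clusters(const uint32_t *cps, unsigned *clusters,
                          size_t n, is_mark_func_t is_mark)
{
    for (size_t i = 1; i < n; i++)
        if (is_mark(cps[i]))
            clusters[i] = clusters[i - 1];
}
```

With the input 0x0E01,0x0E34,0x0E48 and clusters (0,1,2), calling
form_clusters() with is_mark_nil leaves (0,1,2), while is_mark_thai gives
(0,0,0).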

behdad

On 04/17/2012 01:27 PM, datao zhang wrote:
> Hi behdad:
>  
> Thanks for your comments.
>  
> I have rechecked the cluster values, but I found they are still (0,1,2).
> I don't know why you get (0,0,0).
>  
> I tested it using the following code, which I wrote myself:
>  
> ....
> unsigned int uchar[3];  /* holds the three input codepoints */
> for (int i = 0; i < 3; i++)
>     hb_buffer_add(mBuffer, uchar[i], 1, i);  /* cluster value = input index i */
> hb_buffer_set_direction(mBuffer, HB_DIRECTION_LTR);
> hb_buffer_set_script(mBuffer, HB_SCRIPT_THAI);
> hb_shape(mFont, mBuffer, NULL, 0);
> ....
>  
> After hb_shape(), I see cluster[0]: 0; cluster[1]: 1; cluster[2]: 2
>  
> Do you have any comments? Did I make a mistake?
>  
> Maybe I have the wrong concept. I know that clusters in HarfBuzz are not meant
> for line breaking, but I think that, as with Indic, a Thai syllable should
> share a single cluster, shouldn't it?
>  
> Br,
> Dean
>  
>> Date: Tue, 17 Apr 2012 10:28:15 -0400
>> From: behdad at behdad.org
>> To: dataozhang at hotmail.com
>> Subject: Re: [HarfBuzz] A problem in thai shaper
>>
>> On 04/17/2012 10:26 AM, Behdad Esfahbod wrote:
>> > On 04/17/2012 08:01 AM, datao zhang wrote:
>> >> Hi:
>> >> For Problem 1:
>> >> Example: if I pass "0x0E01,0x0E34,0x0E48" with input clusters
>> >> (0,1,2), then after shaping the output clusters should be (0,0,0), because the
>> >> syllable can't be broken at a line break. But currently I find the output
>> >> clusters are still (0,1,2).
>> >
>> > First, note that HarfBuzz clusters are not supposed to be used for things like
>> > linebreaking and cursor positioning. So (0,1,2) is totally fine if there are
>> > three separate glyphs representing those characters. And (0,1,2) is exactly
>> > what Uniscribe returns.
>>
>> Err, my bad. Both HarfBuzz and Uniscribe return (0,0,0) for the sequence, so
>> I don't think there's anything to fix here.
>>
>> b
>>
>> > HarfBuzz however returns (0,0,0) for that sequence.
>> > How were you testing? I'm leaning towards trying to match Uniscribe here.
>> > The finer-grained the cluster values are, the better cursor positioning can be
>> > built on top of HarfBuzz.
>> >
>> > behdad
>> >
>> >> Br,
>> >> Dean
>> >>
>> >>> Date: Mon, 16 Apr 2012 21:08:49 -0400
>> >>> From: behdad at behdad.org
>> >>> To: dataozhang at hotmail.com
>> >>> CC: harfbuzz at lists.freedesktop.org
>> >>> Subject: Re: [HarfBuzz] A problem in thai shaper
>> >>>
>> >>> Hi,
>> >>>
>> >>> Thanks for the email. My comments inline.
>> >>>
>> >>> On 04/13/2012 09:41 AM, datao zhang wrote:
>> >>>> So I think for the new Thai shaper, the valid composition of “consonant [1
>> >>>> mandatory] + diacritic vowel [1 optional] + tone mark [1 optional]” should be
>> >>>> set as the same cluster.
>> >>>
>> >>> I would guess that our generic layer will already take care of this based on
>> >>> canonical combining categories? Do you have a test case that you want to see
>> >>> improved?
>> >>>
>> >>>
>> >>>> Problem 2:
>> >>>>
>> >>>> When no consonant exists, the dotted circle should be inserted as the base
>> >>>> character. This logic should be the first step for the shaping engine, to find
>> >>>> the invalid combining marks. Refer to
>> >>>> http://www.microsoft.com/typography/otfntdev/thaiot/shaping.aspx#comb
>> >>>
>> >>> Right. We do not handle invalid combining marks yet. That's something I want
>> >>> to do at some point but it's not high priority.
>> >>>
>> >>> behdad
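
The syllable rule and the dotted-circle step from the quoted thread
(consonant, then at most one diacritic vowel, then at most one tone mark;
a mark with no valid base gets U+25CC in front of it) can be sketched
roughly as below. This is an illustration only: the function names and the
character ranges are simplifying assumptions, not the full rule set from
the Microsoft document and not what HarfBuzz does.

```c
#include <stddef.h>
#include <stdint.h>

#define DOTTED_CIRCLE 0x25CCu

/* Simplified character classes (assumed ranges, for illustration only). */
static int is_thai_consonant(uint32_t cp)
{
    return cp >= 0x0E01 && cp <= 0x0E2E;
}

static int is_thai_vowel_mark(uint32_t cp)
{
    return cp == 0x0E31 || (cp >= 0x0E34 && cp <= 0x0E39);
}

static int is_thai_tone_mark(uint32_t cp)
{
    return cp >= 0x0E48 && cp <= 0x0E4B;
}

/* Length of the syllable starting at cps[i]: one consonant, then an
 * optional vowel mark, then an optional tone mark. Returns 0 if cps[i]
 * is not a consonant. */
static size_t thai_syllable_length(const uint32_t *cps, size_t n, size_t i)
{
    size_t j = i;
    if (j >= n || !is_thai_consonant(cps[j]))
        return 0;
    j++;
    if (j < n && is_thai_vowel_mark(cps[j]))
        j++;
    if (j < n && is_thai_tone_mark(cps[j]))
        j++;
    return j - i;
}

/* Copies cps to out, putting U+25CC before every mark that is not part
 * of a valid syllable. out must have room for 2 * n codepoints.
 * Returns the number of codepoints written. */
static size_t insert_dotted_circles(const uint32_t *cps, size_t n,
                                    uint32_t *out)
{
    size_t i = 0, k = 0;
    while (i < n) {
        size_t len = thai_syllable_length(cps, n, i);
        if (len > 0) {
            while (len--)
                out[k++] = cps[i++];
        } else {
            if (is_thai_vowel_mark(cps[i]) || is_thai_tone_mark(cps[i]))
                out[k++] = DOTTED_CIRCLE;
            out[k++] = cps[i++];
        }
    }
    return k;
}
```

For example, 0x0E01,0x0E34,0x0E48 is one syllable of length 3 and is copied
unchanged, while a leading 0x0E34 with no consonant before it comes out as
0x25CC,0x0E34.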


