[PATCH 0/5] Improve text protocol

Weng Xuetian wengxt at gmail.com
Tue Apr 16 10:09:35 PDT 2013


On Tuesday, April 16, 2013 10:16:53, Jan Arne Petersen wrote:
> Hi,
> 
> On 04/15/2013 09:14 PM, Bill Spitzak wrote:
> > Jan Arne Petersen wrote:
> >> * Changes offsets to be Unicode character instead of byte based
> > 
> > No, PLEASE DON'T DO THIS!!!
> > 
> > You think you are making things "easier" but you are making it much much
> > harder.
> 
> My main reason was that EFL, IBus and partly GTK+ were using Unicode
> characters as offsets and I did not want to have to specify how to
> handle 'invalid' byte offsets.
> 
> > You may not believe it, but "how many characters are in this
> > UTF-8" will generate dozens of different answers and should never be
> > used as part of a communication api.
> 
> "Unicode characters" is indeed not good enough for a protocol
> specification. I should have written "Unicode code points" instead. But
> even with that we still have the problem with invalid byte sequences. So
> I do not really mind using byte offsets.
> 
> But we still need to think about how to handle invalid byte sequences
> anyways. What do we expect a toolkit to do when text with invalid byte
> sequences is inserted with commit_string? How to handle
> delete_surrounding_text with the byte offsets not matching code points?
> Should the toolkit ignore such requests or should we leave that as
> undefined behavior?

In either case it should be considered a bug. But if the text in the field 
is already invalid from the very beginning (I hit this case while developing 
fcitx's surrounding text support and testing with konsole), it is the fault 
of neither the text protocol nor the application you are using.

An application really can contain invalid byte sequences, and when that 
happens it is valid and expected behavior for the application: for example, 
when you open a file with the wrong encoding, or with broken text, or any 
other crazy thing.

This complexity should be left to the toolkit, application, or input method 
implementation. Handling it in the protocol would bring unnecessary work 
and limitations.
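
To illustrate the kind of check a toolkit could do on its own side, here is a 
rough sketch. It assumes byte offsets into a UTF-8 buffer; the helper names 
is_utf8_boundary and snap_to_utf8_boundary are made up for this example and 
are not part of the text protocol or of any toolkit. It only looks for 
continuation bytes (10xxxxxx), so on text that is already broken it is a 
best-effort heuristic and nothing more.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true if `offset` does not point into the
 * middle of a UTF-8 sequence, i.e. it is the end of the buffer or the byte
 * at that position is not a continuation byte (10xxxxxx). */
static bool is_utf8_boundary(const char *text, size_t len, size_t offset)
{
    assert(text != NULL);
    if (offset > len)
        return false;   /* outside the buffer */
    if (offset == len)
        return true;    /* end of buffer is always a boundary */
    return ((unsigned char)text[offset] & 0xC0) != 0x80;
}

/* Hypothetical helper: clamps `offset` to the buffer and moves it back to
 * the nearest boundary at or before it. Stops at 0 even if the buffer
 * starts with stray continuation bytes. */
static size_t snap_to_utf8_boundary(const char *text, size_t len, size_t offset)
{
    assert(text != NULL);
    if (offset > len)
        offset = len;
    while (offset > 0 && !is_utf8_boundary(text, len, offset))
        offset--;
    return offset;
}
```

A toolkit could, for instance, ignore a delete_surrounding_text request whose 
offsets fail is_utf8_boundary, or snap them first. Which of the two is right 
is a policy decision for the toolkit, not something the protocol has to fix.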

