[Libreoffice] Proofing API Performance

Thu Nov 10 02:19:56 PST 2011

On 09.11.2011 12:05, Tino Didriksen wrote:
> I am making spelling/grammar/hyphenation extensions that query a remote
> service, and have some performance issues that I hope there are existing
> solutions for. In all cases, the extensions must work with "check
> spelling/grammar as you type" enabled.
>
> - How can I limit the request rate or make it smarter?
> Currently, LO seems to call the API for every word (or even letter)
> typed, which is incredibly wasteful as grammar checking only makes sense
> at sentence level. I also don't really want the whole paragraph at each
> call; just the last finished sentence.

Since I was deeply involved in the design and implementation of the
proofreading API, let me add some hopefully helpful explanations.

Keep in mind that the proofreading API is a general-purpose API that
must be usable by a wide range of possibly very different checkers in
arbitrary languages.

It might be that *you* don't want the whole paragraph, but other
proofreaders might. There are several possible reasons for that, the
simplest but most important one being that Writer's detection of
sentence boundaries sometimes fails (that is why the proofreading API
allows the proofreader to override the provided sentence boundaries and
return better ones). Besides that, IMHO sometimes even a whole paragraph
isn't enough (e.g. in the case of lists).

I agree that the call frequency might appear too high for grammar
checkers. But calling the checker only when the current sentence (the
one containing the cursor) is complete would increase the complexity of
the Writer code that interfaces with the proofreader. Moreover, some
grammar checkers also want to do spell checking (and in most cases they
do it better than a pure spell checker).

All in all, IMHO it seems smarter to let the proofreader itself decide
whether a sentence is complete and how to deal with that. The more
frequent calls don't have a big performance impact because the
proofreader runs in its own thread, and as long as it does not call back
into Writer (as is the case if it only checks for sentence
completeness), the user won't even notice, especially on machines with
more than one CPU core, where a second core is usually not used by
Writer at all.
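
To illustrate the idea, here is a rough sketch in Python of a checker
that returns early unless a new sentence has been completed. It is not
taken from any existing extension: `SentenceGate`, `on_call` and
`remote_check` are made-up names, and the sentence-end rule (".", "!" or
"?" followed by whitespace or the end of the text) is a naive
assumption that a real checker would replace with language-specific
logic.

```python
import re

# Assumption: a sentence ends at '.', '!' or '?' followed by whitespace
# or the end of the text. Real checkers need language-specific rules.
_SENTENCE = re.compile(r'\S.*?[.!?]+(?=\s|$)', re.DOTALL)


def last_finished_sentence(paragraph):
    """Return the last complete sentence in the paragraph, or None."""
    matches = _SENTENCE.findall(paragraph)
    return matches[-1] if matches else None


class SentenceGate:
    """Calls the expensive checker only when a new sentence is complete."""

    def __init__(self, remote_check):
        # Hypothetical callable: sentence text -> list of errors.
        self._remote_check = remote_check
        self._last_sentence = None
        self._last_result = []

    def on_call(self, paragraph):
        sentence = last_finished_sentence(paragraph)
        if sentence is None:
            return []  # nothing complete yet: cheap early return
        if sentence != self._last_sentence:
            # A different sentence was finished: query the service once.
            self._last_sentence = sentence
            self._last_result = self._remote_check(sentence)
        return self._last_result
```

With something like this, the frequent calls from Writer cost little
more than a regular expression match, and the remote service is only
contacted when the last finished sentence actually changes.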

> - Why doesn't LO remember the results?
> It draws the squigglies, but it then calls the checker again when
> right-clicking on an error, even if no changes are made in the interim.
> I can cache this in the extension, but it feels like something that
> should be handled in LO itself.

There was a reason for that behavior, but I can't remember it right
now. I will tell you if it comes back to me. :-)
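
In the meantime, caching inside the extension, as you suggest, is a
reasonable workaround. A minimal sketch of such a cache in Python,
assuming results depend only on the locale and the text; `ResultCache`,
`lookup` and the `check` callable are hypothetical names, not part of
the API:

```python
from collections import OrderedDict


class ResultCache:
    """Small LRU cache for checker results, keyed by (locale, text)."""

    def __init__(self, check, max_entries=256):
        # Hypothetical callable: (locale, text) -> checker result.
        self._check = check
        self._max_entries = max_entries
        self._entries = OrderedDict()

    def lookup(self, locale, text):
        key = (locale, text)
        if key in self._entries:
            # Cache hit: mark as most recently used, skip the service.
            self._entries.move_to_end(key)
            return self._entries[key]
        result = self._check(locale, text)
        self._entries[key] = result
        if len(self._entries) > self._max_entries:
            # Evict the least recently used entry.
            self._entries.popitem(last=False)
        return result
```

A second call with unchanged text, such as the one triggered by the
right-click, would then be answered from the cache without a new
request.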

> In general, it feels like "as you type" incurs 50x more calls than
> needed. So if I missed some obvious option toggle or existing solution,
> I'd love to know.

IIRC the call doesn't happen on every keystroke, but every time a word
is finished. At least that is how it was implemented back then.

Regards,
Mathias