[a11y] LibreOffice Calc exposes 2^31 children, freezes on `GetChildren`
Michael Meeks
michael.meeks at collabora.com
Tue Jun 11 09:40:23 UTC 2024
Hi Michael,
Some great questions and points here; forgive my intruding into the space again =)
On 11/06/2024 08:55, Michael Weghorn wrote:
> Limiting children to the "close to visible" cells sounds like a
> potential approach.
For a performant, caching AT I would argue this is the only feasible option =) But ... the navigation API I mentioned is important.
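To make the "close to visible" idea concrete, here is a minimal sketch, assuming a hypothetical `SheetAccessible` class (not LibreOffice's real implementation) that only exposes cells inside the viewport plus a small margin. The `Viewport` type, the `margin` parameter and the choice to report absolute (row, column) positions are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterator, Tuple


@dataclass(frozen=True)
class Viewport:
    """Hypothetical description of the visible cell range (0-based)."""
    first_row: int
    first_col: int
    rows: int
    cols: int


class SheetAccessible:
    """Hypothetical accessible object for a sheet; illustration only."""

    def __init__(self, total_rows: int, total_cols: int,
                 viewport: Viewport, margin: int = 2) -> None:
        self.total_rows = total_rows
        self.total_cols = total_cols
        self.viewport = viewport
        self.margin = margin  # assumed number of extra off-screen rows/cols

    def _bounds(self) -> Tuple[int, int, int, int]:
        # Clamp the viewport plus margin to the sheet's real extent.
        vp, m = self.viewport, self.margin
        r0 = max(0, vp.first_row - m)
        r1 = min(self.total_rows, vp.first_row + vp.rows + m)
        c0 = max(0, vp.first_col - m)
        c1 = min(self.total_cols, vp.first_col + vp.cols + m)
        return r0, r1, c0, c1

    def child_count(self) -> int:
        # Only near-visible cells count as children.
        r0, r1, c0, c1 = self._bounds()
        return (r1 - r0) * (c1 - c0)

    def children(self) -> Iterator[Tuple[int, int]]:
        # Yield absolute (row, col) positions of the exposed cells.
        r0, r1, c0, c1 = self._bounds()
        for row in range(r0, r1):
            for col in range(c0, c1):
                yield (row, col)


# A 50 x 30 viewport whose top-left cell is Q227 (row 226, column 16).
sheet = SheetAccessible(1_048_576, 16_384, Viewport(226, 16, 50, 30))
print(sheet.child_count())
```

The child count then depends on the viewport size rather than on the sheet size, which is what keeps a caching AT's enumeration bounded.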
> However, that would IMHO still need a clear specification on how to
> implement it and how all relevant AT use cases are covered.
Agreed =)
> Some aspects/questions that might need some further consideration:
>
> * How do other interfaces (like AT-SPI Table, TableCell and Selection)
> expose information? Does e.g. the table report it only has 50 rows and
> 30 columns if that's what's visible on screen? Does cell Q227 report a
> row and column index of 0 if it's the first one in the visible area?
I think exposing the whole thing through the Table interface may make some sense; it is clearly a crazy set of cells - and I think it's reasonable to blame ATs if they use this interface for doing something silly.
> * In some cases, off-screen children are of interest, e.g. if they are
> contained in the current selection. How should that be handled? (e.g.
> how does the screen reader announce something like "cell A1 to C100
> selected" if cell A1 "doesn't exist" because it's off-screen?)
Right; so - I mentioned "near to the screen" - by near, I mean we will probably want a number of things that are navigationally close, e.g. "next heading" or some such, to lurk around as real & tracked peers. The content of the Navigator headings should probably always be present in a Writer document's object hierarchy IMHO. That should let ATs very quickly enumerate headings, jump focus to them with a simple API etc.
> * Exposing and caching all cells based on visibility means that whenever
> the view port changes, this needs to be actively updated (push
> approach), which comes with a cost (that I can't estimate right now).
> (We currently have that for other modules, see e.g. comment [3] for
That is the case; we do that for writer on page-down of course.
> * How do screen readers implement features like "read the whole row"?
This comes down to the navigation API I mentioned: having a good API to allow continuous screen-reading of large data-sets - with caching, pre-loading & fetching along e.g. a selection - is really useful. Current Writer behavior is far from optimal since you need to do something odd to get a simple navigation such as "next page" IIRC.
> Do
> they just read the part of the row that's currently visible on screen
> and leave out the rest? Or do they somehow implement some extra logic
> to retrieve the remaining content?
Extra navigation / enumeration logic I think.
> * Is navigating to an "arbitrary" cell still possible via a11y API, e.g.
> if some screen reader specific table navigation command implements "jump
> to the first/last cell in the table" or "select the current row")?
There should be a widget for this at the top of Calc to jump to and select cells - if I were customizing an AT I'd use that.
>> though of course it is then ideal to have some nice navigation API
>> support wrapped around that
>
> What kind of API does that refer to? Existing or new API on the
> platform a11y level that LO (or the toolkits it uses) would then
> implement, or something else? Do you have anything particular in mind?
I was actually somewhat optimistic about the UIA Navigate API conceptually:
https://learn.microsoft.com/en-us/windows/win32/api/uiautomationcore/nf-uiautomationcore-irawelementproviderfragment-navigate
Although - I'd really suggest that a11y doesn't work against the application, and if navigating - it should allow the AT to scroll the actual visible/view-port to match what is being interrogated.
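As an illustration of why a Navigate-style API helps here, the following is a minimal sketch in Python rather than real UIA provider code. The `NavigateDirection` values mirror the five directions of the UIA enum (Parent, NextSibling, PreviousSibling, FirstChild, LastChild); the `Table` and `Cell` classes and the row-major sibling order are hypothetical. The point is that an AT can step from one cell to the next without the application ever materializing the full child list.

```python
from enum import Enum, auto
from typing import Optional, Union


class NavigateDirection(Enum):
    PARENT = auto()
    NEXT_SIBLING = auto()
    PREVIOUS_SIBLING = auto()
    FIRST_CHILD = auto()
    LAST_CHILD = auto()


class Table:
    """Hypothetical table fragment; cells are created on demand."""

    def __init__(self, rows: int, cols: int) -> None:
        self.rows = rows
        self.cols = cols

    def navigate(self, direction: NavigateDirection) -> Optional["Cell"]:
        if direction is NavigateDirection.FIRST_CHILD:
            return Cell(self, 0, 0)
        if direction is NavigateDirection.LAST_CHILD:
            return Cell(self, self.rows - 1, self.cols - 1)
        return None  # parent/siblings of the table are out of scope here


class Cell:
    """Hypothetical cell fragment; holds only its coordinates."""

    def __init__(self, table: Table, row: int, col: int) -> None:
        self.table = table
        self.row = row
        self.col = col

    def navigate(self, direction: NavigateDirection) -> Union[Table, "Cell", None]:
        if direction is NavigateDirection.PARENT:
            return self.table
        # Assumed sibling order: row-major across the whole table.
        index = self.row * self.table.cols + self.col
        if direction is NavigateDirection.NEXT_SIBLING:
            index += 1
        elif direction is NavigateDirection.PREVIOUS_SIBLING:
            index -= 1
        else:
            return None  # cells have no children in this sketch
        if not 0 <= index < self.table.rows * self.table.cols:
            return None
        row, col = divmod(index, self.table.cols)
        return Cell(self.table, row, col)


table = Table(1_048_576, 16_384)
first = table.navigate(NavigateDirection.FIRST_CHILD)
second = first.navigate(NavigateDirection.NEXT_SIBLING)
print(second.row, second.col)
```

Each step costs one call and one small object, regardless of how many cells the table has. Scrolling the real view-port to follow the AT, as suggested above, would be a separate hook and is not modeled here.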
> I've been told repeatedly that the fact that Writer doesn't expose
> off-screen document content is indeed a problem as it breaks features
> like browse mode/document navigation in NVDA or Orca (see e.g.
> tdf#35652, tdf#137955, tdf#91739, tdf#96492).
I'm not convinced that "can't see it in Accerciser" is a real problem; what we all want are fast, accurate and helpful ATs for the impaired. Perhaps I misunderstand, but browse-mode seems conceptually similar to navigating by scrolling the current window down a document - without changing the cursor/document focus.
> to look into at some point. My idea so far is to also expose pages on
> the a11y level, which should avoid the problem of a single object (the
> document) having an enormous amount of children due to that.
> If there are any general concerns about that, please raise them. :-)
I guess this moves the problem to re-pagination; where we can get 300+ pages re-built for the sake of moving a single paragraph; then again - I guess if we are notifying changes in position on large sets of accessible peers we have a similar problem.
> The feedback I've received from a11y experts so far is that off-screen
> doc content should *generally* be exposed on the a11y level, and
> limiting Calc to not do that with its huge amount of table cells is
> meant to be an exception to the rule in that regard (see e.g. the
> discussion in [2] and tdf#156657).
I really think that's a mistake that will ultimately hurt AT performance, and that we should focus on the end-user use-cases we want to succeed with - rather than having an abstract absolutist pre-conception that we can expose everything in an efficient way =)
> I think it's fair to treat that specially, but (repeating myself here)
> my take is it needs clarity on what's the "correct" way to do
> that, and that's something that would IMHO ideally be clearly
> specified by AT and/or a11y protocol developers in a general
> guideline that app developers can cling to, rather than LO
> inventing something by itself.
Sure - that's important; of course - the a11y space is traditionally tragically under-funded, so I suspect our approach is in itself somewhat influential; but whatever it is, it is worth agreeing & writing down.
> If anyone has further thoughts on that, please don't hesitate to share
> them! :-)
Of course; I'm just one viewpoint. My strong feeling is that focusing on things that make it easier to code fast, simple ATs that meet the common use-cases people want is the vital thing; and I really think that trying to re-build arbitrary document models and synchronize them on the other end of a bus - particularly if we require lots of synchronous round-trips to interrogate the content - is not going to fly.
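A rough back-of-the-envelope sketch of the round-trip cost, assuming one synchronous call per child and a purely hypothetical latency of 100 microseconds per call (neither figure is measured; only the 2^31 child count comes from the subject of this thread):

```python
def enumeration_seconds(children: int, round_trip_seconds: float,
                        calls_per_child: int = 1) -> float:
    """Total time if every child needs its own synchronous round-trip."""
    return children * calls_per_child * round_trip_seconds


ASSUMED_ROUND_TRIP = 100e-6  # hypothetical per-call latency, in seconds

# Every child the sheet claims to have, versus a near-visible subset.
full = enumeration_seconds(2**31, ASSUMED_ROUND_TRIP)
near = enumeration_seconds(2_000, ASSUMED_ROUND_TRIP)

print(f"all 2^31 children: about {full / 86_400:.1f} days")
print(f"2000 near-visible children: about {near:.1f} seconds")
```

Under these assumptions, walking every child takes days while a bounded set takes a fraction of a second; real ATs typically make several calls per child, so the gap would only widen.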
On 11/06/2024 09:49, Michael Weghorn wrote:
>> Otherwise, as long as the underlying platform a11y protocols are
>> pull-based and given the input I've received up to this point, I tend
>> to think that ATs actively querying the tree are primarily responsible
>> for limiting that to a reasonable amount of information, but I'm
>> thankful for any guidance here...
It's a nice hope =) I'd want to create APIs that capture the common things that ATs want to do, make them easy, and really hard to screw up.
But now I shut up ;-) we're working on the web side of this; caching bits in the browser and adding another protocol latency there - and I'm sure we want to be handling a reasonably bounded set of data there =)
Regards,
Michael.
--
michael.meeks at collabora.com <><, CEO Collabora Productivity
(M) +44 7795 666 147 - timezone usually UK / Europe